"The Long and Winding Road" is actually the title of yet another Beatles song, and a great one at that. Yet, I didn't wake up today thinking that I was going to post another log entry on this project, with that song title or any other song title in mind. Not at all. But I did decide to look at the x86 disassembly for the code that calculates the current position of the sun in the sky, so as to try to get an idea as to what sort of other platforms it might work on. Custom 6502? 328P? Z80 dream machine. 8080A on a genuine Altair? Or just another Arduino or Propeller project?
void solar_elements::compute_local_aspect()
{
004C2EC0 push ebp
004C2EC1 mov ebp,esp
004C2EC3 sub esp,270h
004C2EC9 push ebx
004C2ECA push esi
004C2ECB push edi
004C2ECC push ecx
004C2ECD lea edi,[ebp-270h]
004C2ED3 mov ecx,9Ch
004C2ED8 mov eax,0CCCCCCCCh
004C2EDD rep stos dword ptr es:[edi]
004C2EDF pop ecx
004C2EE0 mov dword ptr [ebp-8],ecx
double solar_time, ra_base, ecliptic_aspect;
// estimate ecliptic aspect
m_gmt_seconds = 3600.0*(m_hour-m_daylight_savings-m_time_zone)+60.0*m_minute+m_second;
004C2EE3 mov eax,dword ptr [this]
004C2EE6 mov ecx,dword ptr [this]
004C2EE9 mov edx,dword ptr [eax+5Ch]
004C2EEC sub edx,dword ptr [ecx+6Ch]
004C2EEF mov eax,dword ptr [this]
004C2EF2 sub edx,dword ptr [eax+68h]
004C2EF5 mov dword ptr [ebp-26Ch],edx
004C2EFB fild dword ptr [ebp-26Ch]
004C2F01 fmul qword ptr [__real@40ac200000000000 (682830h)]
004C2F07 mov ecx,dword ptr [this]
004C2F0A fild dword ptr [ecx+60h]
004C2F0D fmul qword ptr [__real@404e000000000000 (684210h)]
004C2F13 faddp st(1),st
004C2F15 mov edx,dword ptr [this]
004C2F18 fiadd dword ptr [edx+64h]
004C2F1B mov eax,dword ptr [this]
004C2F1E fstp qword ptr [eax+48h]
solar_time = m_gmt_seconds*(1.0/3600.0) - m_longitude*(1.0/15.0);
004C2F21 mov eax,dword ptr [this]
004C2F24 fld qword ptr [eax+48h]
004C2F27 fmul qword ptr [__real@3f323456789abcdf (682710h)]
004C2F2D mov ecx,dword ptr [this]
004C2F30 fld qword ptr [ecx+40h]
004C2F33 fmul qword ptr [__real@3fb1111111111111 (684208h)]
004C2F39 fsubp st(1),st
004C2F3B fstp qword ptr [solar_time]
m_julian = julian_day (m_month,m_day,m_year) - julian_day (3,20,1900) + solar_time*(1.0/24.0);
004C2F3E mov eax,dword ptr [this]
004C2F41 mov ecx,dword ptr [eax+50h]
004C2F44 push ecx
004C2F45 mov edx,dword ptr [this]
004C2F48 fild dword ptr [edx+58h]
004C2F4B push ecx
004C2F4C fstp dword ptr [esp]
004C2F4F mov eax,dword ptr [this]
004C2F52 mov ecx,dword ptr [eax+54h]
004C2F55 push ecx
004C2F56 mov ecx,dword ptr [this]
004C2F59 call solar_elements::julian_day (4C2DF0h)
004C2F5E push 76Ch
004C2F63 push ecx
004C2F64 fld dword ptr [__real@41a00000 (67BF70h)]
004C2F6A fstp dword ptr [esp]
004C2F6D push 3
004C2F6F mov ecx,dword ptr [this]
004C2F72 fstp qword ptr [ebp-270h]
004C2F78 call solar_elements::julian_day (4C2DF0h)
004C2F7D fsubr qword ptr [ebp-270h]
004C2F83 fld qword ptr [solar_time]
004C2F86 fmul qword ptr [__real@3fa5555555555555 (684200h)]
004C2F8C faddp st(1),st
004C2F8E call _ftol2_sse (603840h)
004C2F93 mov edx,dword ptr [this]
004C2F96 mov dword ptr [edx+70h],eax
.................... and so on ................
O.K., if you made it this far, it should be obvious that the original C++ code has no conditional logic or loops, at least for this particular function. And a quick look at the hexadecimal values suggests that, other than function calls into the tensor library, this whole thing only needs about 800 bytes or so. So I am feeling very good about the possibility that this sort of thing could be run out of ROM on a custom 6502-based system, if I went that route. But then again, maybe a 328P would be a good candidate, with its built-in 32K of Flash EPROM, and maybe just enough RAM to get the job done. Maybe. Obviously, I would need to look into issues of such things as how the available precision of the provided floating point libraries would affect things. Yet there would also be workarounds for that.
Or maybe I could rewrite this in raw FORTH?
Or else you know what they say, "Travelling through hyperspace isn't like crop dusting back home", since "without branchless programming, you will never be able to simultaneously compute 500,000 objects in parallel on your GPU, as you are trying to fly through an asteroid field."
Feeling pretty good about what the disassembly looks like in any case. So, I have gone ahead and made some revisions to the C++ source code of the solar_elements::compute_local_aspect() function.
void solar_elements::compute_local_aspect()
{
double solar_time, ra_base, ecliptic_aspect;
// estimate ecliptic aspect
m_gmt_seconds = 3600.0*(m_hour-m_daylight_savings-m_time_zone)+60.0*m_minute+m_second;
solar_time = m_gmt_seconds*(1.0/3600.0);
m_julian = julian_day (m_month,m_day,m_year) - julian_day (3,20,1900) + solar_time*(1.0/24.0);
ecliptic_aspect = fmodf(m_julian,365.242191)*(TWO_PI/365.242191);
_vector ecliptic_aspect_vector = tensor::rotate_z (ecliptic_aspect)*_vector(1,0,0);
ra_base = ecliptic_aspect*(24.0/TWO_PI);
tensor SIDT = tensor::rotate_z ((solar_time-ra_base)*(TWO_PI/24.0)- TO_RADIANS(m_longitude));
_vector local_xyz;
polar_orientation p;
p.m_theta = 0.0;// TO_RADIANS(m_longitude);
p.m_phi = TO_RADIANS(m_latitude);
local_xyz = p.get_xyz ();
tensor ROTL = tensor::rotator(
_vector::normalize (local_xyz.cross(_vector(0,0,1))),p.m_phi-PI_OVER_TWO
);
// m_geocentric_solar_vector = m_ROTX*ecliptic_aspect_vector;
// _vector siderial_solar_vector = SIDT*m_geocentric_solar_vector;
// _vector local_solar_vector = ROTL*siderial_solar_vector;
tensor GLSUN = ROTL*SIDT*m_ROTX;
m_local_solar_vector = GLSUN*ecliptic_aspect_vector;
polar_orientation q = polar_orientation::from_axis (m_local_solar_vector);
m_el = TO_DEGREES(q.m_phi);
m_az = fmodf(360.0+TO_DEGREES(q.m_theta),360.0);
}
Hopefully, this makes a bit more sense, and I think it may correct some issues in the earlier code. Of course, if you look closely, there is now a place where instead of doing a bunch of vector operations where I produce a series of vectors by multiplying a series to tensors, to produce the next vector in the series; I instead precompute the product of the three tensors ROTL, SIDT, and ROTX to produce the tensor product GLSUN, which can then be used to transform the ecliptic aspect vector to, hopefully obtain the correct position of the sun in the sky for any given longitudinal, latitude, as well as date and time.
Of course, if multiplying a tensor by a tensor in this context needs 27 multiplies, then it looks like doing this part that way will need 27+27+9 = 63 multiplications to transform the ecliptic aspect vector, whereas by doing it by performing a series of tensor by vector multiplications would only use 9+9+9 = 27 multiplications. Yet, if there is a way to reuse the computed tensor GLSUN when computing individual planets, as a part of a more general heliocentric to local transform, then hopefully we get those multiplies back, if we can get it down to just 9 multiplies to transform each celestial vector, plus any required additions of course. So, I think that the efficiency tradeoff is going to be worth it, besides appearing more elegant, for now.
glgorman
Discussions
Become a Hackaday.io Member
Create an account to leave a comment. Already have an account? Log In.