During my last run, I briefly saw 404 errors but couldn't make sense of them because the outputs of the parallel commands were scrambled together.
Over the last few days and weeks, I've noticed more transient errors on hackaday.io, so I have to find a way to wait and retry when a page fails to load on the first attempt...
Until then, I made a different version with all the parallelising removed; the output is also saved to a log file for easy grepping. The new file backup_serial.sh is slower but apparently safer.
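For the record, capturing the log can be as simple as piping through tee (a minimal sketch; the log file name backup.log is arbitrary):

    # Run the serial script and mirror everything it prints to a log file,
    # so transient errors can be grepped for afterwards.
    ./backup_serial.sh 2>&1 | tee backup.log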
Actually, 404 errors are becoming endemic. One script run can get a few or more, and there is no provision yet to retry... I have to code this, because several independent runs are required to get a good sampling of the data.
Some wget magic is needed...
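wget already has retry knobs built in; here is a minimal sketch of what that magic could look like. Two caveats: --retry-on-http-error needs a reasonably recent GNU Wget, treating 404 as retryable is my assumption about these transient errors, and $URL is a placeholder.

    # Retry up to 5 times with a pause between attempts, and also retry on
    # HTTP 404, since the 404s appear to be transient rather than real.
    wget --tries=5 --waitretry=10 --retry-connrefused \
         --retry-on-http-error=404 "$URL"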
New twist!
No 404 error this time. The page might load, but the content will be "something is wrong. please reload the page." I should have taken a screenshot and saved the page to extract its textual signature...
I must find a way to restart the download when this error occurs too.
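A sketch of what such a restart wrapper could look like, assuming the bad page can be detected by grepping for that message (the function name fetch_page and the retry/delay values are made up):

    # Re-download while the saved page carries the error signature instead of
    # real content; give up after 5 attempts.
    fetch_page() {
      local url=$1 out=$2 try
      for try in 1 2 3 4 5; do
        wget -q -O "$out" "$url" \
          && ! grep -qi "something is wrong" "$out" \
          && return 0
        sleep 10    # let the server breathe before retrying
      done
      return 1
    }

Each page download would then go through fetch_page "$url" page.html instead of a bare wget call.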
Discussions
Personally, I'm not too fond of parallelizing in scripts... it makes output dang-near unparseable... But there may be ways around that... I recently learned about e.g. 'tee' which could probably be used to output each parallel process into a different log-file... Still, things like watching the scroll-bar in download-processes is nice. Sometimes I create new console windows for separate processes and redirect the output to 'em, but that's a bit OS-specific and a bit confusing because the process isn't actually *running* there, it's just outputting its data there. Oh, and parallelizing makes 'resume' dang-near impossible even if you plan to write a new script to handle a specific failure... But I'm no expert.
If you accept the slower operation, the latest script is pretty chill :-)
Now I might have to redesign "$WGET" to make it a function that calls wget repeatedly a few times in case of 404...
But I'm lazy and the latest run has encountered no error so it will be for another day.
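For whenever that day comes, a sketch of the "$WGET" redesign, assuming the script invokes it as "$WGET" "$url" (the function name and retry counts are hypothetical):

    # Shadow the $WGET variable with a retrying function; existing call sites
    # like "$WGET" "$url" keep working unchanged.
    WGET=wget_retry
    wget_retry() {
      local try
      for try in 1 2 3; do
        wget "$@" && return 0   # success: stop retrying
        sleep 5                 # 404 or network error: wait, then try again
      done
      return 1
    }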