MIDI transcriber

Transcribe MIDI to readable music notation

Lions briefly used Finale for Mac in 2017.  It was clunky, but it was the only program which could convert MIDI directly to lion readable music notation.  Thoughts immediately turned to how a better version could be made from scratch.  Every other program was a typesetter.  Typesetting music is extremely slow, entailing dragging or clicking on every single note's position.  Entering it by playing is much faster.

  • MIDI capture & bars

    lion mclionhead, 12/23/2025 at 05:10

    By the time 1 month had gone by, it was clear that the peanuts xmas song would not be transcribed in time.  Lion readable music notation from MIDI is a hard problem.

    Reverse engineering the MIDI protocol from hex dumps, instead of formulating an AI prompt, revealed that the start codes are ORed with 0x80.  All the other data is limited to 0x00-0x7f.  The packets are variable length: sometimes 4 bytes ending in 0, sometimes 3 bytes.
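    A minimal sketch of how that rule can split a raw byte stream into packets: any byte with bit 7 set starts a new packet & everything else is data.  The function name `split_packets` & the sample bytes are made up for illustration, not taken from the hex dumps.

```python
def split_packets(stream: bytes) -> list[bytes]:
    """Split a raw MIDI byte stream into packets at each start code."""
    packets = []
    current = bytearray()
    for byte in stream:
        if byte & 0x80:
            # Start code: flush the previous packet & begin a new one.
            if current:
                packets.append(bytes(current))
            current = bytearray([byte])
        elif current:
            # Data byte (0x00-0x7f) belonging to the current packet.
            current.append(byte)
        # Data bytes seen before any start code are dropped.
    if current:
        packets.append(bytes(current))
    return packets


# Invented sample: a 3 byte packet followed by a 4 byte packet ending in 0.
sample = bytes([0x90, 0x3C, 0x40, 0x80, 0x3C, 0x00, 0x00])
packets = split_packets(sample)
```

    Because the packet boundary comes only from the high bit, the splitter doesn't need to know the length of each packet type in advance.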

    The 1st step was using the right pedal to go forwards 1 note & the left pedal to go backwards 1 note.  The right pedal needed debouncing in the form of 2 analog levels.  The left pedal seems to have hysteresis of its own.
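    Debouncing with 2 analog levels amounts to hysteresis: a press is only registered above a high level & the pedal only rearms below a low level.  A sketch, assuming the pedal arrives as an analog value on a 0-127 scale; the class name & both thresholds are hypothetical.

```python
class PedalDebouncer:
    """Two level (hysteresis) debouncer for an analog pedal reading."""

    def __init__(self, press_level: int, release_level: int):
        if release_level >= press_level:
            raise ValueError("release_level must be below press_level")
        self.press_level = press_level
        self.release_level = release_level
        self.pressed = False

    def update(self, level: int) -> bool:
        """Feed one reading; return True only when a new press begins."""
        if not self.pressed and level >= self.press_level:
            self.pressed = True
            return True
        if self.pressed and level <= self.release_level:
            # Rearm only after the pedal drops below the low level.
            self.pressed = False
        return False


# Hypothetical thresholds & readings on a 0-127 scale.
pedal = PedalDebouncer(press_level=96, release_level=32)
readings = [0, 100, 90, 101, 60, 20, 110]
presses = [pedal.update(r) for r in readings]
```

    The wobble between 90 & 101 doesn't count as a 2nd press because the reading never fell below the release level in between.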

    Then to do cross development on raspberry pi, it became clear it needed to switch between single monitor, dual monitor, & 3 page single monitor mode at runtime.  3 page single monitor remanes a notional mode if lions can ever afford enough space. 

    Managed to get some peanuts transcribed by capturing MIDI keypresses.  A lot of behaviors have yet to be fixed, manely more 8va corner cases, undo stack corner cases, accidental corner cases. 

    --------------------------------------------------------------------------------------------------------------------------------------

     It quickly became clear that it needed some minimal indication of timing or measures.  The reason we have bars is the same reason serial data has start bits.  The trick is an object which spans all the staves.  The bars could be in a 3rd staff, which leaves a need for the user to manually shift them sideways without a time signature.  We have rests to shift notes sideways.

    The top staff could be designated as the bar staff & they would follow only edits in that staff.  The problem is the time values in the bottom staff still need to skip the bar time.  The bars in a 3rd staff could just follow the time values in the top staff.  There's no dragging, dropping, cutting or pasting of objects in the score.  There are only rests for horizontal alignment.

    The easiest solution ended up being separate bar objects in all the staves, with the polygons & time shifting based on the top staff.  All the edit operations need a unique, quirky behavior to update the bars & 8vas.  Edits below the 1st staff would shift everything besides the bars while edits in the 1st staff would shift the bars in all the staves.  Deletion & insertion of the bars themselves would shift everything in all the staves.  

    Ended up requiring the user to specify when edits shifted the bars.  Adding a note in 1 staff would entail checking the option, adding the note in 1 staff, then unchecking the option & adding a rest in the other staff.  The mane problem is the lack of any timing information preventing auto computation of the bar positions.  It might need a set of general drag & drop tools.

    The lack of measure wrapping means it can't draw bars on the sides.  Measure wrapping would create the problems of a measure that's too wide to fit on a single line or a case of no bars being defined.  Finale required the user to call a reformatting command to wrap the measures.

    ---------------------------------------------------------------------------------------------------------------------------

    The other big problem was playback of the source audio file from preset times.  This still needs a full computer.  No phone audio player has quite the same capability.  Minidisc recorders could create preset times, but modern phone players are only designed for consumption.  Then the phone output needs to be mixed with the piano output.  The lion kingdom would be so lucky if its minidisc recorder still worked. Went through 2 & all the spare parts.  The replacement gears suffered from some age related degradation.

    1 idea is to have the https://hackaday.io/project/178292-piano-signal-processor...

  • Changed file indicator, key signatures

    lion mclionhead, 12/17/2025 at 00:59

    Updating the changed file indicator normally requires massaging the undo stack after every save so the current level is unchanged & all the other levels are changed.  Then every operation needs to redraw the changed file indicator.  For load operations, you have to either reset the entire undo stack, manetane a 2nd unchanged level in the undo stack, or just deal with a previous save operation now showing a changed file.  It's much easier to make load erase the undo stack & make every undo operation set the file to changed.
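    The easier policy can be sketched in a few lines: save only clears the flag, load erases the undo stack & every undo marks the file changed, even when it happens to restore the saved state.  The `Score` class & its fields are hypothetical stand-ins, not the program's real data structures.

```python
class Score:
    """Toy score with an undo stack & a changed file flag."""

    def __init__(self):
        self.objects = []
        self.undo_stack = []
        self.changed = False

    def edit(self, new_objects):
        # Push the current state so the edit can be undone.
        self.undo_stack.append(list(self.objects))
        self.objects = list(new_objects)
        self.changed = True

    def undo(self):
        if self.undo_stack:
            self.objects = self.undo_stack.pop()
            # Always flag as changed; no attempt to detect the saved level.
            self.changed = True

    def save(self):
        # The undo stack is left alone; only the indicator is cleared.
        self.changed = False

    def load(self, objects):
        self.objects = list(objects)
        self.undo_stack.clear()
        self.changed = False


score = Score()
score.edit(["C4"])
score.save()
score.edit(["C4", "E4"])
score.undo()
```

    After the undo, the score matches what was saved but still shows as changed.  That's the price of not massaging the undo stack after every save.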

    Next came key signatures.  The difficulties there are handling key changes.  For proper notation, it has to draw a double bar & naturals for the previous key.  Of course, in a reduced notation it could stand to just draw the new signature in a different color.

    You have to add the key signature to both clefs manually.  The trick is C major doesn't have a key signature, so key changes end up requiring naturals or it's another page we have to take from Music Construction Set by amending conventional notation.  Not sure how Finale handled the user entering a key change from C to C.  It would have needed a placeholder.  For now, key changes to C major are not supported.

    At this point, having certain objects with zero length & overlapping times started becoming more trouble than it's worth.  The trick is the cursor has to overlap the key signature to record the 1st note after the key signature.  If it's after the key signature, it inserts a rest & records the 1st note at beat 1.  If the time value is ever used for actual timing, it could be smart enough to factor in the object type in the playback time.  It could also have a distinct playback time/length & display time/length.  Every object needs to have a nonzero length in order to position the cursor after the last object.

    During this process, switched to XOR to indicate the active tool.  Vintage XOR interfaces are a lot more appealing now & for a user base of 1.

    It's not known if bars are required for the minimal notation system to be legible.  Lions would try going without & move on to MIDI capture.  It requires developing on the obsolete 800MHz raspberry pi.  Compiling on a 166MHz Cyrix went a lot faster than on an 800MHz raspberry pi.

    With MIDI capture, you're inserting notes into the current beat to build up a chord, then manually advancing a beat or rewinding  a beat with the page buttons.  After capturing a ways on 1 staff with just the keyboard & page buttons, rewind & start capturing the other staff with the trackpad.  Then insert rests to synchronize the 2 staffs.
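    A rough sketch of that capture loop for 1 staff: key presses land in the current beat to build a chord & the page buttons move the cursor.  The `Capture` class & its method names are assumptions for illustration.

```python
class Capture:
    """Toy single staff capture: a list of beats, each a sorted chord."""

    def __init__(self):
        self.beats = [[]]
        self.cursor = 0

    def note_on(self, pitch: int):
        # Add the pitch to the chord at the cursor, ignoring duplicates.
        chord = self.beats[self.cursor]
        if pitch not in chord:
            chord.append(pitch)
            chord.sort()

    def advance(self):
        # Right page button: move forward 1 beat, extending the score.
        self.cursor += 1
        if self.cursor == len(self.beats):
            self.beats.append([])

    def rewind(self):
        # Left page button: move back 1 beat, stopping at the start.
        self.cursor = max(0, self.cursor - 1)


capture = Capture()
capture.note_on(64)
capture.note_on(60)
capture.advance()
capture.note_on(67)
capture.rewind()
capture.note_on(67)
```

    The pitches here are arbitrary MIDI codes.  Synchronizing a 2nd staff would still need the rest insertion step described above.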

    Of course, it would be nice to undo mistakes without the trackpad.  1 of the pedals could make it erase notes by MIDI code.

  • User defined 8va, rests, delete tool

    lion mclionhead, 12/13/2025 at 23:53

    8va would entail a drawing tool for the user to pick a start & another drawing tool for the user to pick an end point.   It picks the direction from which side of the staff you're on.

    Kind of an extension of Music Construction Set in that MCS didn't have an endpoint tool.  It tries to draw indicators for where the markers are going, but it doesn't draw ghost images of the markers like the annotation tools.  The markers shift the score around, because of the ledger lines.  It would be quite noisy if it constantly shifted the score to draw a ghost.

    Then came a delete tool for individual objects.  It would eventually be necessary to delete user defined accidentals while retaining the default accidentals.  That would entail dereferencing the note from the accidental bitmap. 

    It's pretty hard to see what object is selected for deletion, but it tries with crosshairs & an xor box.

    Helas, when it deletes notes, it also creates discontinuities in the beat numbers.  If the user adds a bass note after deleting a treble note, it's going to unpredictably shift the treble notes right because they still have the original times.  It also needs a way to shift notes right without accompaniment.  If it can't draw rests & note durations, it needs some minimal indication of the space for every beat.

    It could either allow notes to have any starting time or it could require rests in all blank areas.  To simplify the programming, it should fill all gaps with rests & have only durations of 1 beat.

    Settled on a half rest symbol to fill all the gaps, since it was less obtrusive.  There's a rest insertion tool to shift notes right.  Inserting a rest after a gap causes all the intervening rests to be inserted & creates a jarring shift in the score.
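    A sketch of the gap filling rule, assuming a staff is stored as a dict from beat number to a list of objects & every object lasts 1 beat.  That data model & the function names are hypothetical.

```python
REST = "rest"


def fill_gaps(staff: dict, last_beat: int) -> dict:
    """Return a copy of the staff where every empty beat holds a rest."""
    return {
        beat: list(staff.get(beat, [])) or [REST]
        for beat in range(last_beat + 1)
    }


def insert_rest(staff: dict, beat: int) -> dict:
    """Insert a 1 beat rest at `beat`, shifting later beats right by 1."""
    shifted = {}
    for b, objects in staff.items():
        shifted[b + 1 if b >= beat else b] = list(objects)
    shifted[beat] = [REST]
    # Any gaps left over get their own rests too.
    return fill_gaps(shifted, max(shifted))


sparse = {0: ["C4"], 3: ["E4"]}
filled = fill_gaps(sparse, 3)
shifted = insert_rest(filled, 1)
```

    Calling `insert_rest` on the sparse staff past its last note fills every intervening beat at once, which is the jarring shift mentioned above.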

    Helas, there's kind of a mess with some objects like 8va's having a nonzero duration that needs to overlap notes & other objects like clefs & key signatures needing a zero duration which doesn't overlap notes.  Multiple objects in a single beat need to be sorted based on enum number.  Then, by setting the enum numbers, you can get bars, clefs, key signatures, notes & rests in the right order.
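    The enum trick can be sketched as a 2 level sort: by beat, then by enum number within the beat.  The enum values below are assumed from the order listed above; the real program's numbering isn't shown.

```python
from enum import IntEnum


class Kind(IntEnum):
    # Assumed numbering, following the drawing order described in the text.
    BAR = 0
    CLEF = 1
    KEY_SIGNATURE = 2
    NOTE = 3
    REST = 4


def draw_order(objects):
    """Sort (beat, kind) pairs by beat, then by enum number within a beat."""
    return sorted(objects, key=lambda obj: (obj[0], obj[1]))


objects = [
    (0, Kind.NOTE),
    (0, Kind.KEY_SIGNATURE),
    (1, Kind.NOTE),
    (0, Kind.BAR),
    (0, Kind.CLEF),
]
ordered = draw_order(objects)
```

    Changing the drawing order then only requires renumbering the enum, not touching the sort.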

    Gone are any snazzy preview graphics of the music reader.  Most of the music reader polygon routines ended up unused.  Only the freepaw tool saw any significant use.  They were more of a realization of a childhood goal of creating a bitmap drawing program. 

  • Line wrapping

    lion mclionhead, 12/11/2025 at 07:25

    Managed to get minimal line wrapping to work.  That requires a lot of non manetainable, janky spaghetti.  Clefs, key signatures & 8va's have to be extended to each line.  There are glitches.  The mouse pointing is really bad.  The replicated clefs on each line are off by a few pixels, but it's far enough along to start letting more glitches slide in order to get somewhere functional.

    There's definite contention on the issue of letting bugs slide in order to get to minimal functionality.  On the 1 paw, we're supposed to be doing test driven development.  On the other paw, we're supposed to be making minimum viable products.  Lions have traditionally favored letting bugs slide to get to the finish line.  It was always about the requirements constantly changing, the low odds of it being finished, the low odds of anyone ever using it.  When working with other animals or making a high profile program with frequent uploads, you always have to keep up with the bugs & can't be finish line focused, because in that case your quality of work is being judged.

    8va line wrapping was the next big one.  Professionally engraved music has the 8va's extend off the end of each line & restart on the next line without any special ending on the previous line.  Take every win you can get.

    Undo & redo would be the last step before finally moving into user defined notation.

  • More staves

    lion mclionhead, 12/02/2025 at 06:21

    Got it to show arbitrary numbers of staves.  The mane trick is aligning the beats in both staves.  Currently, it's hard coded to create 2.  Then came 8va symbols, which were a brute force coding job. 

    Exercised it drawing lowest A & highest C.  It's going to need adaptive staff spacing, but it's going to be rarely used.

    All the drawing used X primitives & pixmaps instead of the more modern method of drawing on a bitmap & blitting or using a modern library.

    The next step was persistent storage.  Lacking a keyboard or desire to type with a mouse, it just automatically creates a filename.  The user has to ssh in & rename it.  This is what the headless sound recorders have always done & it works well.  https://hackaday.io/project/28716-ultimate-4-channel-audio-recorder

    Maybe it's time to make a web based file manager for all these headless programs, like webphone.

    Banging out the persistent storage reminded lions that the music reader program has been on raspberry pi for 5 years & it uses the same GUI library as Cinelerra.   Also, ffmpeg was easy to build on raspberry pi.   Cinelerra could easily compile on a 64 bit raspberry pi but it would need a KVM.

    Just text files with automatically created filenames.

    Next came the insertion point & clipboard operations.  Insertion point operations comprise thousands of things. 

    The Finale videos all showed a solid, unblinking cursor without XOR.  Some of the big operations were deleting single beats, cursor key navigation, mouse navigation, changing staffs & beats, wrapping around the end & beginning of the score, reducing the length of an 8va when deleting a beat.

    Music notation with line wrapping is such a difficult problem, there's no other way to do it but grinding away at diabolical software constructs or in today's lingo, diabolical prompt constructs.  When in doubt, creating more data structures tends to unblock the process.  It's most often necessary to reread different sections of arrays & search in both directions.

    To keep things moving, usage would focus on just inserting & deleting single notes at a time instead of full clipboard operations.  Music Construction Set avoided the problem by drawing a single infinitely wide line.  To be lion readable, it needs line wrapping & paginating.  Noted MCS had all the 8vas extend to the next bar.  The notes were only drawn for X < 255 & the play head was indicated by a caret.

    Important to note that the resolution imposed by the ancient laptop monitors is comparable to 80's VGA.

    For clipboard operations, the Finale examples showed blue highlighted regions without XOR.  The mane problem with clipboard operations is selecting regions with variable numbers of staves, multiple lines, & pages.  It's not known if the score is going to advance by pages or lines.  The music reader did fine with no clipboard operations or region selections for its annotations.

  • Lilypond concept

    lion mclionhead, 11/30/2025 at 01:17

    It turns out there are many standalone programs which convert MIDI to an input file for lilypond.  

    https://lilypond.org/doc/v2.24/Documentation/usage/invoking-midi2ly

    It's unknown how well they work.  The MIDI converters still have to perform automatic accidental assignment.

    The pipeline would begin with MIDI data hacked to produce shortpaw notation.  Lilypond would be called to output a PNG image.  It would redraw the entire captured score after every MIDI event.  The only missing pieces are extents of all the notes for graphical editing & drawing of an insertion point.
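    The 1st stage of such a pipeline might look like the sketch below, which turns captured MIDI pitch codes into a lilypond music expression of whole notes.  It uses lilypond's default Dutch note names in absolute octave mode, where `c'` is middle C.  Spelling every black key as a sharp is an assumption made to keep it short; the function name is hypothetical & rendering the result is left out.

```python
# Lilypond default (Dutch) note names, spelling black keys as sharps.
NAMES = ["c", "cis", "d", "dis", "e", "f", "fis", "g", "gis", "a", "ais", "b"]


def to_lilypond(pitches):
    """Convert MIDI pitch codes to a lilypond expression of whole notes."""
    tokens = []
    for pitch in pitches:
        # MIDI 48-59 has no octave mark; each octave up adds a quote
        # & each octave down adds a comma.
        marks = pitch // 12 - 4
        octave = "'" * marks if marks >= 0 else "," * -marks
        # The trailing 1 is lilypond's duration for a whole note.
        tokens.append(f"{NAMES[pitch % 12]}{octave}1")
    return "{ " + " ".join(tokens) + " }"


text = to_lilypond([60, 64, 67, 47])
```

    The string would still need to be wrapped in a complete input file with a clef & key before lilypond could render it.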

    https://lilypond.org/doc/v2.24/Documentation/usage/configuring-the-system-for-point-and-click

    There is some support for hyperlinking its PDF & SVG output.  That seeds the output files with coordinates of each note, in a roundabout way, but you have to render PDF or SVG output.

    https://lilypond.org/doc/v2.24/Documentation/notation-big-page.html#graphic

    It does support drawing polygons in arbitrary colors to indicate the insertion point.  Arbitrary notes can be colored to indicate where an edit would be applied.  Navigation could be through arrow keys.

    It's unknown if it's fast enough for interactive MIDI capturing an entire score & cursoring through all the notes.  It seems the pipeline would have to render 1 measure at a time & resort to custom software for line wrapping.  That would only be practical if all the notes were the same duration. 

    Most of the lilypond example scores are on https://www.mutopiaproject.org/instruments.html & were manely manually transcribed 10 years ago.  There are a lot of version incompatibilities.

    It takes 2.5 seconds to render the 4 page rach prelude in C# minor on a 2.4GHz Ryzen.  It takes 1 second to render 1 page examples & 2.6 seconds to render the 6 page revolutionary etude.  That would be pretty slow for incremental MIDI capture.

    Not sure why it's so slow.  It might be rendering every bezier curve in the notation font files.  Someone made a lilypond server 10 years ago.

    https://github.com/lyp-packages/lys

    That project similarly noted the slowness.  The claimed speed improvement still wasn't very compelling.

    https://lilypond.org/easier-editing.html

    The interactive front ends all seem to use their own low fidelity music renderers for graphical editing & use lilypond for a high quality final output.  At this point, what lions have done from scratch could still be useful for the capturing stage but it could never produce a final score.

    The only way to convert that into a final score would be to export a lilypond file.  Of course, there would be no way to insert further MIDI data after it's in lilypond format unless lilypond was also used for the capturing.  Still not practical if it's rendering 1 measure at a time.  After applying timing, computing all the measures would be a rough go.  Amendments like that might be rare & minor enough to not require reloading in the MIDI capture program.

    Helas, the raspberry pi's in the music display are very slow & would have to be reflashed to compile lilypond.  Lions aren't financially prepared to upgrade.  It would entail a 4k monitor.  For the MIDI capture, it has to do a lot of stuff outside of lilypond, basically most of what it does now with accidentals, staffs, key signatures, & an insertion point.  That wouldn't go away.

  • MIDI transcriber kickoff

    lion mclionhead, 11/29/2025 at 08:42

    The MIDI transcriber would be an extension of the previously written https://hackaday.io/project/179967-lcd-music-display music display, since lions never expect to have a piano & a PC accessible at the same time.

    The lion kingdom had a novel shortpaw notation system in mind for over 30 years, whereby all the notes were whole notes on an ordinary staff, timing wasn't indicated anywhere.  It was just enough to play a piece with memorized timing but not as cryptic as an ordinary MIDI graph, not as minimal as a lead sheet.  Timing could be manually set in a 2nd pass.  The trick was getting the music display into a notation capture mode & developing the interface.

    Notation capture would most likely be a toggle in the music reader menu.  It would be a totally separate interface. The music reader would perform MIDI capture & playback over USB.  The page turn buttons could advance to the next note or rewind to the previous note in capture mode.  The motivation for revisiting this was the desire for a better arrangement of the Charlie Brown xmas theme.

    -------------------------------------------------------------------------------------------------------------------------------------------------------

    After a solid week on this, the reality of how hard it is became clear.  When transcribing a MIDI keypress into human readable notation, there's a nasty translation through the current clef, current key, current octave, current accidental.  It's a lot harder than dragging & dropping notes.  That's why the cheapest music entry programs are just typesetters.

    Some diabolical algorithms managed to compute shortpaw notation directly from MIDI pitch codes.  A clef & key still must be provided.  The trick with development is to create a database of synthetic MIDI codes with a variety of test cases.  Automatic prediction of the accidentals is still a black art.  Finale never chose perfect accidentals either.  It basically adds sharps if the key is flat & adds flats if the key is sharp.  It adds naturals if the closest position on the staff is a natural.

    Being shortpaw notation, the accidentals currently reset for every note.  It still probably needs to indicate canceled accidentals to be readable.

    Of course, what we really want is the least number of accidentals.  This requires applying knowledge of all the preceding accidentals in the measure, a diabolical problem indeed.

    The general idea is the pitch code is going to be fixed & can only be set through MIDI capture.  The theory is the user would manually override the automatic accidentals by dragging & dropping a new accidental.  The indicated note would move up or down by the necessary amount for the new accidental to result in the same pitch that was captured.

    A few more days led to an algorithm which always draws accidentals if they're different than the key signature.  If they're the same as the key signature, it resets the accidental once.  The next step would be user defined accidentals.

    In this case, MIDI code E + a user defined flat caused it to draw F flat.  Then MIDI code F with no accidental made it draw a natural.  The idea is the MIDI code being fixed & only the notation changing based on the addition of an accidental.  The next step would be supporting a bass clef, with the option to change the number of staffs at different times.
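    The E to F flat case can be sketched as arithmetic on the fixed pitch code: subtract the accidental's offset & see which natural note is left.  This assumes the common convention that MIDI code 60 is C4.  The names are hypothetical & the real program's data structures aren't shown.

```python
# Pitch classes of the natural (white key) notes.
NATURALS = {0: "C", 2: "D", 4: "E", 5: "F", 7: "G", 9: "A", 11: "B"}
# Semitones each accidental adds to the natural note.
OFFSETS = {"flat": -1, "natural": 0, "sharp": 1}


def respell(pitch: int, accidental: str):
    """Return the (letter, octave) staff position which, combined with
    `accidental`, sounds the captured MIDI pitch.  None if impossible."""
    natural_pitch = pitch - OFFSETS[accidental]
    letter = NATURALS.get(natural_pitch % 12)
    if letter is None:
        # The accidental would have to sit on a black key.
        return None
    # Assumed octave convention: MIDI 60 is C4.
    return letter, natural_pitch // 12 - 1


f_flat = respell(64, "flat")     # MIDI code E + a user defined flat
f_plain = respell(65, "natural")  # MIDI code F, drawn on the same line
```

    Both results land on the same staff position, which is why the F that follows needs a natural drawn to cancel the flat.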

    The next big challenges were line wrapping & specifying keys & clefs.  Suddenly even shortpaw notation was proving a pretty daunting task.

    ------------------------------------------------------------------------------------------------------------------------------------------------

    Lacking a keyboard, a lot of dragging & dropping is still going to be required for the multitude of required symbols.  Music Construction Set used a very slow process of dragging accidentals, dots, ties, rests, bars while the...
