Let me start by copying my initial post from that discussion, since it gives context about what problems this is meant to solve. Then I'll follow up with a separate message giving thoughts about how it might be implemented.
I'd like to propose a change to the playback architecture. It
would enable a lot of new features, including several that have been
requested recently on other threads. I'll first describe it at a high
level, and then if people think it sounds like a good idea, I can go
into more detail about how it might work.
The idea is to create a new abstraction layer for translating score
elements into midi commands. It would let you have a file describing
the features and behavior of a particular synthesizer. Then you could
load that file and everything would "just work". In particular, it
would describe three aspects of playback: instruments, articulations,
and dynamics.
MuseScore has built-in support for GM soundfonts, so if you use one
it automatically selects the right sound for every instrument in the
Mixer. This feature would let it do the same for non-GM soundfonts. It
would also work for external midi synthesizers. Just load one file and
it configures everything for you, including sending Program Change
messages as needed.
Currently, MuseScore supports a very small set of articulations. For
example, stringed instruments have three: normal (whatever that means),
pizzicato, and tremolo. But most commercial orchestral synthesizers
have a lot of others: marcato, spiccato, detache, etc., each with or
without mutes, not to mention distinguishing between up bows, down bows,
and slurs. This would let you use those.
Lots of mechanisms can be used for selecting articulations. With GM,
each articulation is a different program, so you select them with
Program Change messages. But a lot of synthesizers use keyswitching,
where you press keys outside the normal range of the instrument to
select articulations. Or they may use a midi controller for that
purpose. Sometimes they use velocity to smoothly blend from legato to
marcato. Slurs can be indicated with the sustain pedal, or by setting
the velocity to 0, or in other ways. So the file needs to describe what
articulations are available, and also how to activate them.
Dynamics can also be indicated in multiple ways. GM synthesizers use
velocity for that purpose, but others use the expression controller, or
the mod wheel, or other mechanisms. And this can even vary between
instruments within a single synthesizer. So percussion instruments
might base dynamics on velocity, but stringed instruments might base it
on a controller.
So my idea is that you would load a file describing a set of
instruments, and the available articulations for each one, and the
mechanics of how to play them, and then MuseScore would take care of
everything for you. Does this sound like a good idea? If so, I've
thought a lot about how it might be implemented, and I can provide
details on my ideas.
Ok, here are my thoughts on a possible implementation. This isn't intended as a finished design, just a starting point for discussion.
Let's start with the file format for describing synthesizers. I propose using an XML-based format. The root tag would be <Synthesizer type="midi">. Or alternatively, type="fluid" or type="zerberus". It could be extended in the future to support other technologies, like type="osc" or type="vst". The <Synthesizer> tag would contain some number of <Instrument> tags, and each of those could optionally contain some number of <Articulation> tags:
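Something like this, where the id and name values are just placeholders:

```xml
<Synthesizer type="midi">
  <Instrument id="violin">
    <Articulation name="pizzicato"/>
    <Articulation name="tremolo"/>
  </Instrument>
  <Instrument id="flute"/>
</Synthesizer>
```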
The instrument id must match the corresponding one in instruments.xml. For type="fluid" or type="zerberus", not much more than that is needed: it just has to specify the soundfont and sound for each one. So for the rest of this I'll focus on midi, which is more complicated.
Each <Instrument> and <Articulation> would contain an <Activate> tag and optionally a <Deactivate> tag. Those would specify the midi commands to send when switching to or from that instrument or articulation.
There would be several other tags defining the midi commands for performing certain actions: <SetDynamics>, <BeginSlur>, <EndSlur>, and possibly others. Those could appear at any level of the file. If the current articulation specifies it, that gets used. If not, it falls back to the instrument's definition. And if that doesn't define it either, it falls back to the definition in the root <Synthesizer> tag.
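To make the layering concrete, here's a rough sketch of how it might all fit together. The tag names come from the proposal above; the attribute names (pitch, velocity, controller, value) and the %dynamics placeholder for the current dynamic level are purely illustrative, not a worked-out design:

```xml
<Synthesizer type="midi">
  <!-- Default for all instruments: dynamics via the expression controller -->
  <SetDynamics>
    <ControlChange controller="11" value="%dynamics"/>
  </SetDynamics>
  <Instrument id="violin">
    <Articulation name="pizzicato">
      <!-- keyswitch on a note below the violin's playable range -->
      <Activate>
        <NoteOn pitch="24" velocity="100"/>
      </Activate>
      <Deactivate>
        <NoteOn pitch="25" velocity="100"/>
      </Deactivate>
    </Articulation>
  </Instrument>
</Synthesizer>
```

Since the violin and its articulation don't define <SetDynamics>, dynamics for it would fall back to the root definition.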
Next let's consider how those tags define midi commands. Each one contains a list of commands. Those would include all the standard things you'd expect: <NoteOn>, <NoteOff>, <ControlChange>, <ProgramChange>, etc. There would also be tags for <NoteOnVelocity> and <NoteOffVelocity>. Those don't send any commands themselves, but they set the velocity to be used for later notes.
For example, an instrument that bases dynamics on velocity would specify:
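A sketch of what that might look like (the attribute name and the %dynamics placeholder are my own invention, purely for illustration):

```xml
<Instrument id="timpani">
  <SetDynamics>
    <!-- sends nothing itself; sets the velocity used by later NoteOns -->
    <NoteOnVelocity value="%dynamics"/>
  </SetDynamics>
</Instrument>
```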
The file would also need a way to specify percussion kits. That could be done with a <Kit> tag. It would contain many of the same tags as <Instrument>: <Activate>, <SetDynamics>, etc. It would also contain a list of the instruments contained in the kit, and the notes for each one.
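For instance (again, the <Drum> tag and its attributes are just a guess at what the details might look like):

```xml
<Kit id="orchestral-percussion">
  <Activate>
    <ProgramChange program="48"/>
  </Activate>
  <!-- which instrument sounds on which midi note -->
  <Drum pitch="35" instrument="bass-drum"/>
  <Drum pitch="38" instrument="snare-drum"/>
  <Drum pitch="49" instrument="crash-cymbal"/>
</Kit>
```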
Ok, now let's consider how the user interface for this would work. The user would be able to load any number of these files. I'll call them "synthesizer configurations" (not necessarily a great name, but it will do for the moment). It will clearly require minor changes to several windows, including Synthesizer, Pianoroll Editor, and Staff Text Properties. But the most important changes will be to the Mixer window.
Currently, it includes a menu for each instrument where you can select any sound from any loaded soundfont. I suggest a simple change to those menus: add entries (at the very top, separated from the rest of the list by a divider, selected by default) listing all loaded configurations that include that instrument. You could still override it by choosing a different sound, but by default it would play that instrument in whatever way was defined by the configuration.
This design would also involve a larger change. Currently, the mixer has a separate entry for every articulation of every instrument, and takes up a separate midi channel for each one. Clearly that can't scale to large numbers of articulations, and even with the current limited set it's problematic. Create a standard five-part string section, and you've already used up 15 of the 16 available midi channels! So I would change it to have only a single line and use only a single midi channel for each instrument. Changing between articulations would be done however it's specified in the configuration file. I think this is a much better approach, but it does create issues for backward compatibility. I don't know whether that's considered a problem or not.
All the synths you then go on to list use MIDI as their communication protocol.
Currently MuseScore uses two native synths - FluidSynth and Zerberus - which play SF2 and SFZ soundfonts respectively.
I would suggest that you use the soundfont format as the type - eg <Synthesiser type="sf2"> or <Synthesiser type="sfz">. If you really wanted to you could then go on to have an attribute describing it in more detail, such as <Synthesiser type="sf2" engine="FluidSynth">.
We also need to agree on the spelling of the word Synthesiser which is spelt differently by Americans compared to the rest of the English speaking world.
The other thing you need to be aware of is that articulation isn't necessarily controlled by Program Change messages in a GM-compatible SF2. The only reason why MuseScore currently uses this system in string and trumpet parts is the existence of the pizzicato strings, tremolo strings, and muted trumpet GM patches. These are the only articulations available in a GM soundfont, although a GM2 soundfont may have others using the Bank Select system.
It is perfectly possible to produce an SF2 soundfont which controls articulation by means of velocity switching, in the way that Garritan Personal Orchestra does for example, using fixed velocities for sample playback and the expression controller for dynamics in the relevant instruments. In fact I was looking into that very thing yesterday. It is much easier to do in an SFZ soundfont, though.
(GPO btw is SFZ based, and if you know what you're doing it is possible to go in and customise it to your requirements.)
Finally, you should be aware that MuseScore tends not to use Note Off messages, using Note Ons with a velocity of 0 instead in order to maintain Running Status, so including Note Offs as part of your specification is unwise.
You should also look at the Articulations already present in the Instruments.xml file, which provide a way of defining many of the articulations used by strings. IIRC these can be customised as part of an instrument's definition in Instruments.xml.
I hope this helps with your spec design.
PS I fully support changes to the mixer dialogue - there are already two proposals in the issue tracker for doing this that I know about - here is the one I initiated....
https://musescore.org/en/node/63666 This also links to the other.
> I'd like to propose a change to the playback architecture. It would
> enable a lot of new features, including several that have been
> requested recently on other threads. I'll first describe it at a high
> level, and then if people think it sounds like a good idea, I can go
> into more detail about how it might work.
> The idea is to create a new abstraction layer for translating score
> elements into midi commands. It would let you have a file describing
> the features and behavior of a particular synthesizer. Then you could
> load that file and everything would "just work". In particular, it
> would describe three aspects of playback: instruments, articulations,
> and dynamics.
Have you noticed how this is beginning to sound a little bit like a
"No man is an island, entire of itself; every man is a piece of the
continent, a part of the main.
If a clod be washed away by the sea, Europe is the less, as well as
if a promontory were, as well as if a manor of thy friend's or of
thine own were."
-- John Donne, Meditation XVII
> I would suggest that you use the soundfont format as the type - eg <Synthesiser type="sf2"> or <Synthesiser type="sfz">.
I'm not sure which approach is better. That depends on how the synthesizers are implemented, and I'm not at all familiar with the MuseScore source code. (I've skimmed it a bit, but that's all.)
I was figuring there would be an abstract class called SynthesizerConfiguration, or something like that, that would define an API for playing notes, setting dynamics, etc. That API would not be midi-specific, so it would work entirely in terms of musical events ("set dynamics") rather than technology-specific ones ("set controller x to value y"). There would be various subclasses for particular synthesizers or technologies, and the "type" tag would determine which subclass was instantiated.
So the question is what subclasses would need to exist. Could there be a single subclass that works equally well for any sfz based synthesizer? Or would there need to be a Zerberus-specific subclass? From what I've seen, it would be the latter. For example, there's no generic mechanism for telling an sfz based synth to load a particular file. The loadInstrument() method is specific to Zerberus.
> We also need to agree on the spelling of the word Synthesiser which is spelt differently by Americans compared to the rest of the English speaking world.
As far as the code is concerned, it looks like that decision has already been made:
> I'm not sure which approach is better. That depends on how the synthesizers are implemented, and I'm not at all familiar with the MuseScore source code. (I've skimmed it a bit, but that's all.)
Well I can tell you that at synthesiser level both Zerberus and FluidSynth are using MIDI as the bottom level communication protocol, but I am not familiar with the higher level source code either.
SFZ synths currently fall into 3 types - SFZ 1.0, SFZ 2.0 and Plogue - and then there is Zerberus, which currently only recognises a subset of SFZ 2.0 opcodes, although I think we have a GSOC student working on improving that.
The problem you are going to find is that an SFZ 1.0 synth will not recognise many of the SFZ 2.0 opcodes, and vice versa. Plogue SFZ is mainly 2.0 compliant, but it has its own series of opcodes which aren't recognised by either 1.0 or 2.0 synths.
Really the best thing to do is let the user manage their SFZ player (apart from Zerberus of course which has its own UI within MuseScore). I am currently using 3 different SFZ players, depending on the nature of the SFZ file being loaded. The problem being that the SFZ format is so open that it is easy to add customised opcodes to work with your particular player implementation, which is why, of course, many VST instruments have SFZ at their core.
SF2 is well defined and standard. The soundfont will either be 2.01 (with 16 bit samples) or 2.04 (with 24 bit samples), and FluidSynth is a mature and well established player (although MuseScore's version of it is currently quite a long way behind the mainstream version).
Yeah, that's about what I figured. And really, it shouldn't matter to MuseScore what file format a particular synth uses, or even what synthesis method it uses. What matters is how to control that synth. What interface should it use, and what sequence of commands should it send with that interface?
Does anyone else have thoughts about this? I was hoping the core developers might chime in with information about what would be involved in implementing it.
Are there any freely available soundfonts (SF2 or SFZ) that actually do any of this (provide multiple samples for all these different articulations)? If so, can you point to some documentation on how they are set up? Or maybe there are standards for this?
> Yeah, that's about what I figured. And really, it shouldn't matter to
> MuseScore what file format a particular synth uses, or even what synthesis
> method it uses. What matters is how to control that synth. What interface
> should it use, and what sequence of commands should it send with that
> interface?
> Does anyone else have thoughts about this? I was hoping the core developers
> might chime in with information about what would be involved in implementing
> it.
I often use Sonatina Symphonic Orchestra, which also includes a staccato version of many instruments. But for really large sets of articulations, you need to go to commercial packages. See, for example,
I realize I may have been unclear on one point. In all of the above, when I talk about "MIDI" I'm really talking about "external MIDI devices". Ones where you just specify a port and channel, and messages get sent there, and what happens then is completely out of MuseScore's control. Both Fluid and Zerberus also use MIDI as a communication protocol, but MuseScore uses a different method to send messages to them, and it can control them in other ways that aren't possible with external devices (like telling them to load a particular file).
It's conceivable that someday MuseScore could support interfaces that don't involve MIDI at all, like OSC. I've tried to make sure this design will have no trouble accommodating that, if it should happen.
Are you acquainted with the work done by Maxim Grishin (igevorse) on MuseScore, with respect to MIDI output, during GSoC 2014 and GSoC 2015?
Although I couldn't help directly with the implementation, I was able to discuss some ideas with him, regarding a full MIDI output implementation and, as far as I know, he solved the external MIDI devices issue on one of his MuseScore pull requests (see ).
But his improvements never made it to the official 2.0.0 release, because of changes in the MuseScore file structure (he had to start rewriting part of his improved code from scratch, I think). Last summer, we talked about one or two new ideas of mine by mail, and I expected some feedback from him on those ideas, but he stopped answering at a point.
Also, since that time, there are no new entries in his blog regarding the follow-up on his work on the MuseScore MIDI output features. It's a pity, because he got things on a good path.
In my opinion, this might be the time for someone, with good expertise in C++/Qt (not my case, unfortunately), to pick up where he left and continue the work on his pull requests. Would you be interested?
I just checked, and his pull request for "Assigning MIDI port/channel to instruments" was merged into the master branch on 28 Oct 2015, and these features are already available (all of them?) in release 2.0.3 (I was using 2.0.2, so I updated to 2.0.3 and saw the functional MIDI Input option in Edit->Preferences->Input/Output).
Nonetheless, the other pull request, "Listening to MIDI Machine Control messages", is still labeled as "work in progress". So, even though things are somewhat more advanced than I stated in the previous post, these may still be a good starting point for your ideas, I think.
I saw his blog posts, and I'm really sorry his MIDI Actions feature never got merged. There's some overlap with this feature in the sense that they both send out MIDI commands, but mostly I see them as independent. I'd like the ability to insert arbitrary MIDI commands anywhere I want in the score, but I also don't want to *have* to do that just to make basic things like dynamics work correctly. So they're useful features that solve different problems. Of the two, this feature is the one I care about more.
I'm certainly willing to help implement this, though I can't commit to doing the whole thing myself. Like everyone, I have limited time! Also, I'm very nervous about stomping around in the heart of a codebase I don't understand very well and probably breaking things in the process. :) I'd rather write things that are self contained, or at least fairly peripheral, and then let someone who knows the code much better integrate them.
For example, it looks like the job of playing a score is controlled by the functions in rendermidi.cpp. They look through all the elements in the score and generate a series of NPlayEvents. And NPlayEvent is a subclass of MidiCoreEvent, so the assumption that everything is based on MIDI is already inherent in the event hierarchy! That clearly would need to be changed, and it should probably be done by someone who thoroughly understands the current design.
So here's my best attempt at a plan for how to implement this:
1. Create an interface for the new abstraction layer.
2. Create concrete subclasses of it for the three supported types (Fluid, Zerberus, external MIDI).
3. Write the code for instantiating those subclasses based on an XML file.
4. Modify all the playback code to go through the new abstraction layer, rather than calling the synthesizer directly.
5. Change various parts of the UI as needed to support this feature.
6. Probably a lot of other stuff I don't know about. The new system for articulations will presumably involve some architecture changes, but I don't know what. And it will affect the format for score files, since they record the mixer configuration. And I don't know what else, but maybe a lot.
I'm pretty confident I can do parts 1-3. Beyond that, I can't commit.
I'm still unclear as to why you think that Fluid and Zerberus need to be treated differently from other synths.
The only difference between them and external synths is that there is a file handling UI directly in MuseScore, so you load soundfonts into them from MuseScore. In all other respects they use MIDI messages to produce music, just like any other synth.
I suspect you're in danger of over-complicating things, which would result in sluggish, unwieldy code.
For greatest efficiency it would be better to define articulations and instrument behaviours in terms of MIDI events, which can then be sent directly to the synths concerned, rather than have a structure above the MIDI core which then has to be translated into MIDI messages.
There are a few reasons for treating them differently.
First, the built in synthesizers have unique abilities that aren't available for external synthesizers. Most importantly, we can directly instruct them to load a soundfont. That isn't possible with external synthesizers, which may not even use soundfonts.
Second, the way you configure the communication is very different for them. With an external synthesizer, data gets sent through JACK, and you need to specify the port(s) and channel(s) to send data to. With internal synthesizers, that isn't necessary. The data just gets passed to the synthesizer directly through its own API. MuseScore can define ports and channels however it wants, since they don't need to correspond to any external device.
The different classes will still of course share a lot of code. That's what subclassing is for. But they'll also have differences.
Also remember that while the currently supported synths are limited to MIDI, we may eventually want to go beyond that. For example, VST3 has its own event system that's more powerful than MIDI events.
> I suspect you in danger of over-complicating things which would result in sluggish, unwieldy code.
Trust me, I don't write sluggish, unwieldy code. :) If you want to look at some of my code, here's what I do for my day job: http://openmm.org
I really need to know whether and how to move forward with this. I think it would be a useful feature. It's certainly one that I would like to have! But I haven't gotten a lot of encouragement, much less advice on how to design/implement it. I'm not sure how to interpret that.
I don't really understand how the MuseScore development community works. Is there someone who's in charge? Who are the core developers? How do decisions get made?
Perhaps this proposal doesn't fit with their vision for the product. If so, that's fine. I totally understand. I just need to know! I don't want to put a lot of work into this, only to find out that it will never be used. On the other hand, if I am going to move ahead with this, I need to discuss the design and implementation with someone who knows the code much better than I do.
I'm just not sure where things stand, or what I should be doing now. I need clarification.
I understand what Peter is getting at, and I think I may be able to clarify things a little bit more, so that the MuseScore core developers, should they see fit, may think thoroughly about his proposal.
What Peter wishes to implement is something called “soundset file”, or “dictionary file”. I know that commercial notation programs rely on these files to communicate with external devices (hardware/software synthesizers and samplers). When such devices are GM/GS/XG compatible, those files allow notation programs to tell the devices which program/bank to use and what MIDI CC messages they should listen to. This is essentially what “instruments.xml” is used for in MuseScore, regarding fluidsynth and Zerberus, right?
If MuseScore had MIDI Actions already implemented, a user could write in the score (maybe as hidden objects) a series of incremental CC7 or CC11 values above a crescendo sign, and fluidsynth would read them accordingly and raise the sound volume for that instrument, thus creating the crescendo effect. How does fluidsynth know what CC7 or CC11 mean in terms of the MIDI protocol? Because “instruments.xml” tells it what they are, acting as a sound configuration (“soundset”) file.
If playback devices are strictly GM/GS/XG compatible, then, as long as the software allows the user to directly input MIDI Messages (Actions) on the score, and the user knows the effect each MIDI CC has in playback, it is possible to achieve a greater degree of realism in playback by using the notation software as if it were a sequencer (without it actually being one), that is, by sending MIDI CC messages to the playback devices to affect the sound. This is the approach that igevorse was trying to implement with his MIDI Action code.
But when it comes to sample libraries and playback devices that are not GM/GS/XG compatible, they don’t recognize Program Change messages and don’t apply MIDI CCs in the same way as GM/GS/XG compatible ones. And there isn’t a standard for their use of MIDI CCs. With respect to sample libraries, each vendor/creator attributes functionalities to the various MIDI CCs according to its own convenience (although they might try not to alter too much the functionalities of the most common CCs: CC7, CC10, CC64, for instance). That’s why it is not possible to have a single “soundset file” for all sample libraries and external synthesizers.
Now imagine a scenario where a MuseScore user employs 16 different third party libraries/synthesizers on 16 different tracks and wishes to input some MIDI Actions on the score (if this was already possible in MuseScore) to control playback. He would have to remember all the different Program Changes, Keyswitches and/or CC attributions for each library as he was writing those actions on the score, according to his needs. That’s where Peter’s approach, regarding the use of separate and more elaborate “soundset files”, would come in handy.
If there was a separate “soundset file” for each library, describing how the various Program Changes, Keyswitches and/or CC attributions operated for that specific library, and the different articulations the library’s instrument patches allow, the user would just need to introduce text elements like “pizz.”, “arco”, “spicc.”, “col legno”, “cresc.”, or “dim.” and those would be translated by MuseScore — according to the information contained within the “soundset file” specified for that track — into a stream of MIDI Messages sent to the appropriate playback device loaded with the library, along with other regular MIDI Events needed for playback. Thus, the playback action of each library could be controlled by a separate “soundset file”, without the need for the user to tediously input numerous Keyswitches and CCs on the score, by hand, for each track with a different library.
I guess that Peter’s idea of an abstraction layer relates to the possibility of allowing the future implementation of different communication protocols (like OSC), other than MIDI, to control playback devices, and also the possibility of addressing all playback devices, including the internal ones (fluidsynth and Zerberus) using a single and common function within the MuseScore’s structure, a function that reads each soundset file and translates them into a stream of MIDI Messages (correct me if I’m wrong here). In this regard, I would suggest that the root tag for the XML structure of the "soundset file" should address the protocol type (MIDI, osc, vst) only.
Regarding the question of VSTi: nobody is asking to turn MuseScore into a VSTi host at this point. There is freely available software able to do that (on Windows, Mac and Linux [both with and without using wine]), allowing a VST plugin/sampler to be used as a standalone external (MIDI) device. What is being asked for is to allow communication with such external devices (via JACK MIDI, I suppose), to control/manipulate a specific software/hardware synthesizer, or a specific sample library loaded into that plugin/sampler. Regarding the latter, as I mentioned above, the idea isn’t to control the sampler itself, but rather to tell the sampler how to control the library loaded into it: which samples to activate, what control changes to perform, on which MIDI port/channel or instrument patch, etc.
Finally, with respect to sample libraries and articulations: to my knowledge, there aren't yet(!) any freely available sample libraries using multiple articulations (other than the usual sustain/staccato/pizzicato ones), but there are some SFZ (open format) based commercial libraries, like GPO4/5, which allow the user to access the SFZ instrument definition files, which define how the samples for a particular instrument patch should be played. In such libraries, the licensing for samples is still restricted (no modification, duplication, releasing, etc.), but the instrument definition settings may be changed by the user, as long as he knows what he's doing. So those files can be viewed, analyzed, studied, changed and saved. Of course, Zerberus isn't able to load such libraries, but MuseScore should be able to communicate with external samplers loaded with commercial or freeware sample libraries alike.