much more work. eSpeak can convert text to phonetic symbols for many
>
>> Making the user input phonetic symbols instead of actual lyrics is
>> not a solution.
>
> Sorry, I didn't mean to propose that. I just wanted to note that a fallback
> that allowed phonetic symbols would be necessary.
>
> As to the rest, my (unofficial) thought is that it currently takes quite a
> bit of manual intervention to get English working well with the UTAU
> toolchain, whether it uses VCV or CVVC. And each approach requires a
> different set of tools to connect the samples together. It seems to me that
> there's quite a bit of risk of not coming out with something usable at the
> end.
>
> -- David
>
> On Tue, Mar 22, 2016 at 7:58 AM, syrma <[hidden email]> wrote:
>
>> Thank you for your reply.
>>
>> As for the playback, I also think that singing each note the moment it is
>> entered is impossible; we need to set the lyrics first, and even then, the
>> synthesis takes time. But getting it to play the way Cadencii does would
>> probably be good; that is, pressing play once everything is set. Cadencii
>> takes a while to do that, though, and at some point the time spent waiting
>> for the synthesis is probably several times the time spent actually
>> editing (that said, I think a lot of optimization is possible in Cadencii,
>> so it's probably not the best example).
>>
>> Leaving the questions about dictionaries for later, a side note about my
>> struggles with v.Connect-STAND, Cadencii's synthesis engine. I have
>> finally been able to get some results out of it (by switching between my
>> Linux and Windows machines every time one of them hits a problem). The
>> rendering is more than decent in my opinion (although it depends a lot on
>> the settings and the voicebank used, and it can sound worse than
>> e-cantorix if not used properly (okay, not that bad, but still)), and I
>> think it is an interesting tool overall (some UTAU users import their
>> projects into v.Connect-STAND to get a better rendering, but that is
>> sometimes a little tricky). However, a few points hinder direct use:
>>
>> - The Windows binaries won't work unless the system is as Japanese as
>> possible, and while I don't know what is causing this yet (because I am not
>> used to compiling on Windows), this needs a fix.
>> - Encoding auto-detection is probably needed; even my Linux-built version
>> expects its input to be encoded as Shift-JIS by default (the typical
>> encoding of files created by Japanese users on Windows). It supports other
>> encodings, but the user has to specify them.
>> - The software takes a meta text sequence file (its own format) and
>> outputs audio. While I think implementing a conversion from a score to a
>> meta text sequence would be sufficient for the first part of the project
>> (generating the audio), I believe an optimization might optionally be
>> possible. As v.Connect is based on WORLD (which implements real-time
>> singing synthesis, according to its introduction page), I am wondering
>> whether the code could be changed to intercept the parameters before the
>> audio is generated and play it in real time. I have not dived far enough
>> into v.Connect's code, so if someone who has thinks I am heading down a
>> wrong and completely impossible path, please do let me know.
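The encoding auto-detection mentioned in the second point could be sketched roughly like this. This is a minimal illustration, not v.Connect's actual loading code; the candidate encodings and their order are my assumptions (strict UTF-8 first, since Shift-JIS text will usually fail a UTF-8 decode, then the common Japanese encodings):

```python
# Minimal sketch of encoding auto-detection for UTAU/v.Connect text files.
# The candidate list and its order are assumptions, not v.Connect behavior.

CANDIDATE_ENCODINGS = ("utf-8-sig", "shift_jis", "cp932", "euc_jp")

def read_text_guessing_encoding(data: bytes) -> tuple[str, str]:
    """Return (decoded_text, encoding_used) for the first encoding that fits."""
    for enc in CANDIDATE_ENCODINGS:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("could not detect encoding; ask the user to specify one")
```

A real implementation would probably also validate the decoded text against the expected file syntax, since a wrong encoding can still decode "successfully" into mojibake.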
>>
>> A very interesting point about it, however, is its ability to convert and
>> use UTAU voicebanks, given the great number of downloadable UTAU voices on
>> the net (let's forget for now about the mass of problems that alone
>> causes). While looking into the possibility of using English with UTAU
>> voices, I came across, among others, this page:
>> http://utau.wiki/cv-vc (see also:
>> utau.wiki/tutorials:cvvc-english-tutorial-by-mystsaphyr ). This seems to
>> be popular enough that a lot of UTAUloids use this method to simulate
>> non-Japanese pronunciation. Namine Ritsu, a free voice for v.Connect-STAND
>> (and the most popular one), also has recordings of this kind, although the
>> way English is rendered is far from perfect, and accents are entirely left
>> for the user to simulate. There are also (non-open-source) plugins that
>> can convert lyrics (or rather sequence files) from CVVC to VCV (another
>> style used in UTAU voicebanks). Even though this allows the user to get
>> and add voice sets from the internet, I can easily think of a few issues
>> one could come across:
>>
>> - Making the user input phonetic symbols instead of actual lyrics is not a
>> solution. I think it may be possible to convert lyrics to eSpeak phonemes
>> and implement the remaining conversion step (which would depend on the
>> voice). That brings us to another set of problems: the user would need to
>> supply both the word and its hyphenation. And even then, other problems
>> are bound to happen, either because the word isn't in the dictionary or
>> because the sound isn't available. In the first case, the user may need to
>> provide the pronunciation (for a proper noun, for example). Besides this,
>> should we let the user modify the automatically generated pronunciation to
>> simulate an accent or to make something sound more natural?
>>
>> - Encoding problems, always. Japanese on Windows is unpredictably tricky to
>> deal with.
>>
>> - Voicebanks are usually recorded for one specific language. I could be
>> wrong, but for now I don't see how we could detect the language unless the
>> user specifies it. Also, some of the Japanese voicebanks are only
>> compatible with either romaji or kana lyrics (we could use KAKASI to
>> convert either the lyrics or the voicebank).
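The "remaining conversion step" from the first issue above — turning a phoneme sequence into the sample aliases a CVVC voicebank expects — might look something like this. Everything here is invented for illustration: the phoneme inventory, the vowel flags, and the alias naming scheme; a real voicebank's oto.ini defines the actual aliases, and this assumes every consonant is immediately followed by a vowel:

```python
# Hypothetical sketch: map a flat phoneme sequence to CVVC-style sample
# aliases (CV samples plus VC transition samples). The inventory and
# naming scheme are invented; a real oto.ini defines the real aliases.

# Invented mini-inventory: phoneme -> True if vowel, False if consonant.
PHONES = {"h": False, "@": True, "l": False, "oU": True}

def cvvc_aliases(phonemes: list[str]) -> list[str]:
    """Build CV and VC sample names from a phoneme list (C always before V)."""
    aliases = []
    prev_vowel = None
    i = 0
    while i < len(phonemes):
        p = phonemes[i]
        if PHONES[p]:                 # bare vowel (e.g. word-initial)
            aliases.append(p)
            prev_vowel = p
            i += 1
        else:                         # consonant: emit VC transition, then CV
            if prev_vowel is not None:
                aliases.append(f"{prev_vowel} {p}")
            v = phonemes[i + 1]       # simplifying assumption: C is followed by V
            aliases.append(f"{p}{v}")
            prev_vowel = v
            i += 2
    return aliases
```

For an input like ["h", "@", "l", "oU"], this yields the CV sample "h@", the VC transition "@ l", and the CV sample "loU" — the same CV/VC alternation the CVVC tutorials linked above describe.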
>>
>> Anyway, I don't think one summer's worth of work would be enough to even
>> think about all the issues (everything is so much more complicated than it
>> first seems). The question would be: how much would make an acceptable
>> project?
>>
>> The project I have in mind for now would be something like the following:
>>
>> - As a first step, taking care of the usability issues of v.Connect-STAND,
>> or ideally turning it into a usable library.
>> - Implementing the generation of meta text sequences (it would be
>> interesting to see how Cadencii, the open-source C++/Qt editor, does it).
>> This should include the processing of whatever settings we have (including
>> phonemes), as this kind of file should provide all the information needed
>> for synthesis.
>> - Making a MuseScore plugin out of the two aforementioned items. This
>> would additionally include:
>> - the front-end (collecting settings)
>> - the playback function
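The sequence-generation step in the plan above could be sketched as follows. To be clear, the Note fields and the line-based output layout are placeholders and NOT v.Connect-STAND's real format; Cadencii's writer would be the reference for the actual syntax:

```python
# Illustrative sketch of score -> sequence-text generation. The fields
# and layout are placeholders, not v.Connect-STAND's actual file format;
# Cadencii's writer is the reference for the real syntax.

from dataclasses import dataclass

@dataclass
class Note:
    tick: int        # start position in ticks
    length: int      # duration in ticks
    pitch: int       # MIDI note number
    lyric: str       # lyric syllable (or phoneme string)

def to_sequence(notes: list[Note], tempo: float = 120.0) -> str:
    """Serialize notes into a simple line-based sequence text."""
    lines = [f"[Tempo] {tempo}"]
    for n in sorted(notes, key=lambda n: n.tick):
        lines.append(f"[Note] tick={n.tick} len={n.length} "
                     f"pitch={n.pitch} lyric={n.lyric}")
    return "\n".join(lines)
```

The point of the sketch is the shape of the task: flatten the score into time-ordered note events carrying the lyric/phoneme settings, so that the synthesis engine needs nothing else from the editor.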
>>
>> Though I don't know if this is relevant to the current discussion (or at
>> all), while looking for good free voice data, I found that Namine Ritsu's
>> license is very unclear to me (the site the wiki pages link to for the
>> terms of use doesn't exist anymore). There is a separation between the
>> character (visual art, profile, ...) and the voice resources. I suspect
>> from the contradicting official information that it has changed over
>> time. The character itself seems to be the property of Canon, but there
>> don't seem to be any restrictions on the use of the voices. In addition,
>> this voicebank (http://hal-the-cat.music.coocan.jp/ritsu_e.html) says it
>> is released under the terms of the GPLv3. I assume at least this
>> voicebank is safe enough.
>> [Unclear official material:
>> - http://www.canon-voice.com/english/kiyaku.html (the English says
>> something very unclear about the character, but the voice is free)
>> - http://canon-voice.com/ritsu.html ]
>>
>> So immediate questions are:
>> - Is this a realistic and/or an acceptable project?
>> - I am not aware of MuseScore plugin rules, so is such an approach alright?
>> If not, what is the better way?
>> - I am not sure where to integrate the second part, but I think the part
>> integrated into MuseScore should be as general as possible, to gradually
>> add support for other tools.
>>
>> Sorry for the long post. Please let me know your opinion, and whether I am
>> analyzing things wrong!
>>
>>
>>
>> --
>> View this message in context:
>> http://dev-list.musescore.org/GSOC-2016-Regarding-the-Virtual-Singer-project-idea-tp7579698p7579737.html
>> Sent from the MuseScore Developer mailing list archive at Nabble.com.
>>
>>
>> _______________________________________________
>> Mscore-developer mailing list
>>
>> [hidden email]
>> https://lists.sourceforge.net/lists/listinfo/mscore-developer
>
>
>