
JavaScript - text to speech with pitch and duration control

I've been looking into making my JavaScript program sing.

I first looked at the Web Speech API, but its pitch control seems very limited. I thought there might be a way to send the result to a Web Audio node and apply effects from there, but that doesn't seem possible.

I found the mespeak.js library: http://www.masswerk.at/mespeak/

It can return an audio buffer, which I treat as the source for my audio nodes, allowing for more control.

My input is a sequence of notes, each with a frequency and a duration. Something like:

var seq = [[440, 1000], [880, 500] /* , ... */]; // [frequency in Hz, duration in ms]

Starting from this sequence and a series of words, I managed to get my program to say those words in rhythm at different frequencies.

But I'm having a few problems.

  • The Web Audio detune is linked to the playbackRate, so changing the pitch also changes the word duration.
  • I only seem to be able to play with the detune or playbackRate values, which does not let me input a frequency (e.g. 440) and get the result at that frequency. The only thing I can do is approximate the base frequency of my speech, compute the difference to the expected note and pitch accordingly, but the result is not very good.
  • I am clueless about how to deal with duration. Is the playbackRate the only way of artificially changing the duration of a word? How can I force my word to stretch to a specific duration?

If any of you have experience with this sort of thing, I'd appreciate any input.
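
The coupling described in the first two points can be made concrete with a little arithmetic. The following is only a minimal sketch: it assumes the effective playback rate of a buffer source is multiplied by 2^(detune / 1200), and the function names and the 110 Hz base frequency used in the example are hypothetical, not measured values of the speech generator.

```javascript
// Cents between an assumed base frequency of the voice and a target note.
function centsBetween(baseHz, targetHz) {
  return 1200 * Math.log2(targetHz / baseHz);
}

// Playback-rate multiplier that a detune of `cents` produces.
function rateForCents(cents) {
  return Math.pow(2, cents / 1200);
}

// Length in seconds of a buffer once the detune has been applied.
function detunedDuration(bufferSeconds, cents) {
  return bufferSeconds / rateForCents(cents);
}

// Hypothetical example: a 0.8 s word whose base pitch is guessed at 110 Hz,
// shifted up to 440 Hz (two octaves), is played four times faster.
var cents = centsBetween(110, 440);
var seconds = detunedDuration(0.8, cents);
```

This shows why a word pitched up by detune alone gets shorter: reaching a target duration as well would need a separate time-stretching step, or control over pitch and speed at synthesis time.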

Thanks a lot

EDIT: add some code

function sing(text, note, duration) {
  var buffer = meSpeak.speak(text, {rawdata: 'default'});
  playSound(buffer, freqToCents(note), duration);
}

function freqToCents(freq) {
  var root = 440; // no idea what the base frequency of the speech generator is
  // 1200 cents per octave (3986 * log10(x) approximates the same value)
  return 1200 * Math.log2(freq / root);
}

function playSound(streamBuffer, cents, duration, callback) {
  var source = context.createBufferSource();
  source.connect(compressor);

  context.decodeAudioData(streamBuffer, function (audioData) {
    // The requested `duration` parameter is not used yet; the delay below
    // comes from the decoded buffer's own length.
    var bufferDuration = audioData.duration;
    var delay = bufferDuration ? Math.ceil(bufferDuration * 1000) : 1000;
    if (typeof callback === 'function') setTimeout(callback, delay);
    source.buffer = audioData;
    source.detune.value = cents;

    source.start(0);
  }, function (error) { console.error(error); });
}

My sequencer is working and, at each step, calls the sing function if necessary, for example like this:

sing('test', 440, 1000)

As I was saying, I'd like the duration parameter to affect the result.


Source: https://stackoverflow.com/questions/36891851
Updated: 2019-12-07 11:12

Accepted answer

Espeak supports an SSML mode. You need to use it to modify the parameters at synthesis time instead of trying to post-process the results.

Experiment with espeak itself first, and then try to reproduce the same results in the JavaScript port. SSML is not supported there yet, but in this part of mespeak.js

  '-w', 'wav.wav',
      '-a', (typeof args.amplitude !== 'undefined')? String(args.amplitude) : (typeof args.a !== 'undefined')? String(args.a) : '1
      '-g', (typeof args.wordgap !== 'undefined')? String(args.wordgap) : (typeof args.g !== 'undefined')? String(args.g) : '0',
      '-p', (typeof args.pitch !== 'undefined')? String(args.pitch) : (typeof args.p !== 'undefined')? String(args.p) : '50',
      '-s', (typeof args.speed !== 'undefined')? String(args.speed) : (typeof args.s !== 'undefined')? String(args.s) : '175',

you need to add the -m option to enable SSML.
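
As a rough illustration of what the SSML input could look like, here is a minimal sketch that turns a list of words and a note sequence of [frequency in Hz, duration in ms] pairs into an SSML string using the standard `<prosody>` element. The helper names `escapeXml` and `buildSsml` are hypothetical, and whether espeak (or a patched mespeak.js) actually honours absolute `pitch` values in Hz and the `duration` attribute is an assumption that should be checked against espeak first.

```javascript
// Escape characters that are special in XML so arbitrary words are safe.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build one <prosody> element per word from [frequencyHz, durationMs] notes.
// Assumption: the engine accepts pitch in Hz and duration in ms, which SSML
// defines but a given synthesizer may ignore.
function buildSsml(words, seq) {
  var parts = words.map(function (word, i) {
    var note = seq[i];
    return '<prosody pitch="' + note[0] + 'Hz" duration="' + note[1] + 'ms">' +
      escapeXml(word) + '</prosody>';
  });
  return '<speak>' + parts.join(' ') + '</speak>';
}

var ssml = buildSsml(['la', 'la'], [[440, 1000], [880, 500]]);
```

The resulting string would be passed to the synthesizer as the text to speak once the -m option is enabled, so that pitch and timing are decided during synthesis rather than by detuning the finished audio.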

2016-04-28
