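If we follow their suggestion and look for sponsorship, with one engineer improving the website and one person handling Facebook and offline events, itaigi would be in a much better position to have a bigger impact.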
GitHub
8k-tdnnf · twgo/gu2-im1_pian7-sik4_offline@0d5e630
Contribute to twgo/gu2-im1_pian7-sik4_offline development by creating an account on GitHub.
GitHub
DNN-train's tdnnf-219-16k did not run · Issue #150 · twgo/twgo-exp
I pushed tdnnf-220-16k and tdnnf-219-16k at the same time, and 220 ended up running twice (38, 39)
GitHub
twgo/siann1-hak8_boo5-hing5_nnet3
nnet3 acoustic model training. Contribute to twgo/siann1-hak8_boo5-hing5_nnet3 development by creating an account on GitHub.
airitilibrary.com
This research validates a Taiwanese speech corpus by combining speech recognition with speech assessment to pre-screen unvalidated recordings. Letting a machine first filter out problematic audio files (recording volume too low, too much noise, incorrect recorded content, and so on) replaces the time-consuming traditional practice of manual listening checks. The work has three stages: acoustic model training, speech assessment and error labeling, and performance evaluation.
In the acoustic model training stage, we use ForSD (Formosa Speech Database), provided by Chang Gung University (CGU), to train hidden Markov models (HMMs) as the acoustic models. Three model units are tested: monophone, intra-syllable right-context-dependent biphone, and intra-syllable left- and right-context-dependent triphone. The recognition network is based on free syllable decoding. The best syllable accuracies of the three model types on the test data are 27.20%, 43.28%, and 45.93%, respectively.
In the speech assessment and error labeling stage, we use the trained triphone HMMs to score the unvalidated part of the corpus, then split it by thresholds into a low-score set, a mid-score set, and a high-score set. For part of the low-score set, we manually label the cause of each error, extract features from those utterances, and train a Support Vector Machine (SVM) classifier. The classifier then re-examines the low-score set and divides it into usable and defective utterances.
In the performance evaluation stage, we take two subsets of ForSD, TW01 and TW02, as the base training set and add one of the following: the entire unvalidated dataset, the mid-score and high-score sets together, or the high-score set only. Acoustic models trained on these three combined datasets reach syllable accuracies of 40.22%, 41.21%, and 44.35%, respectively. Models trained on screened and unscreened data therefore differ in syllable accuracy by up to 4.13 percentage points, which supports the claim that speech assessment can effectively and automatically screen out problematic utterances.
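As a rough illustration of the three-way split described in the abstract, the sketch below buckets utterances by assessment score. The function name, the example thresholds, and the boundary convention (a score equal to a threshold goes to the higher zone) are assumptions for illustration; the abstract does not state the actual thresholds or scoring scale.

```python
def split_by_score(scores, low_threshold, high_threshold):
    """Partition utterance IDs into low / mid / high score zones.

    scores: dict mapping utterance ID -> assessment score (hypothetical scale).
    Boundary convention (an assumption): score < low_threshold is "low",
    score >= high_threshold is "high", everything in between is "mid".
    """
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")

    zones = {"low": [], "mid": [], "high": []}
    for utt_id, score in scores.items():
        if score < low_threshold:
            zones["low"].append(utt_id)
        elif score < high_threshold:
            zones["mid"].append(utt_id)
        else:
            zones["high"].append(utt_id)
    return zones
```

In the thesis workflow, only the "low" zone would then go on to manual labeling and the SVM re-check.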
YouTube
iptt.sinica.edu.tw
Department of Intellectual Property and Technology Transfer, Academia Sinica
GitHub
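Revising the normalization method + data debugging · Issue #458 · g0v/itaigi
For now I am only writing down preliminary ideas here; once they are more mature and complete, and someone has time to work on them, we can settle on a plan together. These two items already have concrete ideas: 1. Normalization Google Sheet → a front-end interface so that more people can take part: modeled on the political donations project's double entry + verification, with an optional quiz so that contributors' qualifications do not need to be vetted. This raises accuracy, reduces the workload of the team behind the scenes, and makes it easy to participate. 2. Non-standard characters currently in the database (sourced from 線頂辭典): use the same approach as above for a full cleanup. Still working out how to handle the following: 3. How do we prevent large amounts of erroneous data from appearing in 「求講法」 (requests for how to say something)? Or how...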
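The double entry + verification idea in the issue could be sketched roughly as follows: an item is accepted only when at least two different contributors submit the same value, and everything else is queued for review. The data layout, the function name, and the whitespace trimming are hypothetical, not itaigi's actual implementation.

```python
from collections import defaultdict


def double_entry_check(submissions):
    """Accept an item only if two or more contributors entered the same value.

    submissions: iterable of (item_id, contributor, value) tuples
    (a hypothetical layout). Returns (accepted, needs_review), where
    accepted maps item_id -> agreed value and needs_review lists item IDs
    with too few entries or with conflicting entries.
    """
    by_item = defaultdict(dict)
    for item_id, contributor, value in submissions:
        # Keep one entry per contributor; ignore stray surrounding whitespace.
        by_item[item_id][contributor] = value.strip()

    accepted, needs_review = {}, []
    for item_id, entries in by_item.items():
        distinct_values = set(entries.values())
        if len(entries) >= 2 and len(distinct_values) == 1:
            accepted[item_id] = distinct_values.pop()
        else:
            needs_review.append(item_id)
    return accepted, needs_review
```

Because agreement between independent entries stands in for a reviewer, this kind of check is what lets the interface skip vetting each contributor up front.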
GitHub
A Hokkien (Minnan) Tâi-lô input schema designed for the RIME input method. Contribute to a-thok/rime-hokkien development by creating an account on GitHub.
GitHub
add the theme of the form by leo424y · Pull Request #500 · g0v/itaigi
Changed so that a prompt is shown when the value is empty
GitHub
When "your name" is empty, submitting and pressing the pronunciation button should not be possible · Issue #501 · g0v/itaigi
<https://itaigi.tw/name>
help.github.com
Creating a pull request from a fork - User Documentation
If you've forked a repository and made changes to the fork, you can ask that the upstream repository accept your changes by creating a pull request. …
web.pcc.gov.tw
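[Agency] Changhua County Cultural Affairs Bureau [Tender title] Forward-looking Infrastructure Program: Arts and Culture Professional Venue Upgrade Plan - Changhua County Government Renovation Plan - Outsourced Operation and Management of the Taiwanese Language Creative Park [Tender no.] CHCAB108-008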
GitHub
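Rime Taiwanese input method word list (Taiwanese Input Schema for Rime). Contribute to i3thuan5/rime-taigi development by creating an account on GitHub.
A question for everyone here: can the text of Wikipedia articles legally be scraped and used as a corpus? And is there a way to target one particular category and scrape the full text of every article in it? That would also be for use as a corpus.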
facebook.com
尾牙 相揪吃桌 感謝臺灣 (Year-end banquet: let's feast together, thank you Taiwan). 393 likes · 607 talking about this. Product/Service
followculture.strikingly.com
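A pān-toh banquet (辦桌) is more than a group meal: it is about sharing good food with the right people at an important moment. 「辦桌謝平安」 (a banquet to give thanks for peace and safety) captures this best. Relatives and friends travel from afar to take part in the important occasion of thanking heaven and earth, gathering in joy and gratitude, with food that brings together tradition and current trends.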