Application of Song Scope Software to the Automatic Identification of Animal Species by Sound
Abstract
Commercially available autonomous recorders for monitoring vocal wildlife populations such as birds and frogs now make it possible to collect thousands of hours of audio data in a field season. Given limited resources, it is not practical to review this volume of data manually “by ear”. Automatically processing sound recordings to detect and identify specific species from their vocalizations, even if not perfectly accurate, makes efficient use of researchers' time, because they review only those samples most likely to contain vocalizations of interest. This results in significant gains in sample coverage and operating efficiency, and in cost savings.
Developing generalized computer algorithms capable of accurate species identification in real-world field conditions poses difficult challenges. First, autonomous recorders typically receive sounds from all directions, scattered and reflected by trees and obscured by an unpredictable mix of random noise, wind, rustling leaves, airplanes, road traffic, and other species of birds, frogs, insects, and mammals. Second, the vocalizations of many species vary widely from one individual to the next. Any algorithm must be prepared to accept vocalizations that are similar, but not identical, to known references in order to detect a previously unobserved individual. In doing so, however, the algorithm becomes susceptible to misclassifying a vocalization from a different species with similar components. This is especially true for species with narrowband vocalizations lacking distinctive spectral properties and for species with short-duration vocalizations lacking distinctive temporal properties.
Most prior research has differentiated among only a handful of simple monosyllabic vocalizations at a time. While the results have been promising, we found that many approaches degrade significantly as the number of species increases, especially when more complex multisyllabic and highly variable vocalizations are also included.
In this paper, we discuss an algorithm based on Hidden Markov Models automatically constructed so as to consider not just the spectral and temporal features of individual syllables, but also how syllables are organized into more complex songs. Additionally, several techniques are employed to reduce the effects of noise present in recordings made by autonomous recorders.
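To make the idea of modeling how syllables are organized into songs more concrete, the following is a minimal sketch of scoring a syllable sequence under two small discrete Hidden Markov Models with the forward algorithm. It is not the Song Scope algorithm, whose details are not given here. The two syllable symbols, the two models (`alternating` and `repeating`), and all probabilities are hypothetical values chosen only for illustration.

```python
import math

def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm).

    obs   : list of syllable symbol indices
    start : start[s] is the probability of beginning in state s
    trans : trans[p][s] is the probability of moving from state p to s
    emit  : emit[s][o] is the probability that state s emits symbol o
    """
    n = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s]) for p in range(n)])
            + safe_log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

# Hypothetical models. Symbol 0 is a "low" syllable, symbol 1 a "high" one.
# Both models share emissions and differ only in how syllables are ordered.
EMIT = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    # Song that tends to alternate low and high syllables.
    "alternating": {"start": [0.5, 0.5],
                    "trans": [[0.1, 0.9], [0.9, 0.1]],
                    "emit": EMIT},
    # Song that tends to repeat the same syllable.
    "repeating": {"start": [0.5, 0.5],
                  "trans": [[0.9, 0.1], [0.1, 0.9]],
                  "emit": EMIT},
}

def classify(obs):
    """Pick the model name with the highest log-likelihood for obs."""
    return max(MODELS, key=lambda name: forward_log_likelihood(obs, **MODELS[name]))

if __name__ == "__main__":
    for song in ([0, 1, 0, 1, 0, 1], [0, 0, 0, 0, 0, 0]):
        print(song, "->", classify(song))
```

Because the two hypothetical models share the same emission probabilities, any difference in score comes only from syllable order, which is the kind of song-level structure that per-syllable features alone would miss.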
Keywords: Song Scope software, sound acquisition software, wildlife sound monitoring, birdsong monitoring and recording