Additionally, they can employ plug-ins to provide different styles and layouts for printing. Since most current programming languages support Unicode, application programmers and music analysts can easily build algorithms for music information retrieval or computer-based analysis of iSargam music databases. The next section describes the Sargam notation system, and the section after it reviews related work. We then describe the iSargam encoding system, explaining its approach and encoding algorithm.
To improve readability for western readers, we give comparisons with western music concepts wherever applicable. We also present some example encodings of compositions as proof of our approach. Sargam notation is a music notation language for Carnatic music. Each notation starts with the specification of raga, tala, and mela. Sometimes, the notes used in the ascending and descending scales, known as arohana and avarohana, are explicitly defined at the start of the composition. This is followed by the time signature and the actual music notation, as illustrated in Fig.
In this section, we attempt to describe the terminologies used and the notation scheme. Raga [ 17 ] is one of the most distinguished features of Carnatic music. The raga can be defined by a melodic scheme characterized by a definite scale or notes, order or sequence in which the notes can be used, melodic features, pauses and stresses, and tonal graces.
Some ragas define the same set of music notes (swara) but are still differentiated by other features such as the order of appearance of the swara, melodic punctuation, accent, intonation, and melodic phrases. The raga in Carnatic music is analogous to key signatures in western music.
A related field is the specification of arohana and avarohana. There are hundreds of defined rhythm styles (talas) in Indian music. The name of the tala used in the notated composition is given above the notation as in Fig. Now, we attempt to describe the tala system explaining its elements and symbols. This is similar to various note types like crotchets and semiquavers in western music notation.
Also, each element (anga) has a defined reckoning mode. The reckoning mode specifies the actual delivery of the rhythm. The basic rhythm used is notated at the beginning of the notation, and notes are grouped according to it as seen in Fig. The grouping method used is similar to grouping notes according to time signature as in western music notation but with bars of different measure. Mela or Melakartas [ 17 ] are parent ragas from which the other ragas evolved. There are seventy-two of them. A distinguished feature of the mela or parent ragas is that they contain all the seven notes in order.
The melakartas have a numbering scheme and are identified by the number. The parent raga of the composition is so indicated by an integer number as illustrated in Fig. The notation style is primarily script-based where the note symbols are placed on a straight line. Suitable signs and symbols are also used to indicate various other musical features. A detailed review of the Sargam transliteration scheme can be found in [ 18 ].
As shown in Fig. The symbols used in notation can be classified as music note (swara), gamaka symbols, and other symbols. The following section details the concept and notation of musical note, gamaka, and other symbols. A swara or music note usually denotes the note name indicating the pitch, duration, octave, and whether it is played with expressions called gamaka or not.
The notes are named according to their pitch as shadja (sa), rishabha (ri), gandhara (ga), madhyama (ma), panchama (pa), dhaivata (dha), and nishada (ni), with the abbreviations given in brackets.
The current written style drops the vowels and uses single letters: shadja (S or s), rishabha (R or r), gandhara (G or g), madhyama (M or m), panchama (P or p), dhaivata (D or d), and nishada (N or n). In Indian music, there are five referred octaves, with a middle octave and two upper and two lower octaves.
The first upper octave is denoted by adding a dot above the music note and the second upper octave by adding two dots above the music note as shown in Fig. Similarly, the immediate lower octave is denoted by adding a dot below the music note and the next lower octave by adding two dots below the music note as shown in Fig.
A musical note shadja in (a) the middle octave, (b) the lower octave, and (c) two octaves higher. To denote the duration of a music note, uppercase or lowercase letters are used, with or without a comma, semicolon, underline, or overline. A swara letter in lowercase indicates a duration of one aksharakala, and an uppercase swara letter indicates a duration of two aksharakalas.
A comma placed near a music note increases its duration by one aksharakala and a semicolon by two. Similarly, a single horizontal line over the swara reduces the swara duration to its half and double over or under line reduces it to its quarter. The duration of a rest note is indicated using the necessary number of semicolon or comma symbols placed inside simple parenthesis, e.
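The duration rules above can be sketched as a small calculation. This is only an illustration, not the iSargam implementation: the function names `swara_duration` and `rest_duration` are hypothetical, and two points are assumptions rather than statements from the text, namely that an over or under line scales the whole duration (letter plus commas and semicolons) and that commas and semicolons inside a rest count one and two aksharakalas respectively.

```python
from fractions import Fraction

SWARA_LETTERS = "srgmpdn"

def swara_duration(letter, commas=0, semicolons=0, lines=0):
    """Return the duration of one swara in aksharakalas.

    letter     -- single swara letter; lowercase = 1 unit, uppercase = 2 units
    commas     -- each comma adds one aksharakala
    semicolons -- each semicolon adds two aksharakalas
    lines      -- 0 = none, 1 = single line (half), 2 = double line (quarter)
    """
    if len(letter) != 1 or letter.lower() not in SWARA_LETTERS:
        raise ValueError(f"not a swara letter: {letter!r}")
    if lines not in (0, 1, 2):
        raise ValueError("lines must be 0, 1, or 2")
    base = Fraction(2 if letter.isupper() else 1)
    base += commas + 2 * semicolons
    # Assumption: the line marks scale the total duration of the note.
    return base / (2 ** lines)

def rest_duration(commas=0, semicolons=0):
    """Duration of a rest written as commas/semicolons in parentheses (assumed values)."""
    return Fraction(commas + 2 * semicolons)

print(swara_duration("s"))                # lowercase letter
print(swara_duration("S", semicolons=1))  # uppercase letter plus a semicolon
print(swara_duration("s", lines=1))       # single line over the swara
```

Exact fractions are used so that half and quarter durations add up without rounding error when notes are later summed per anga.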
In Carnatic music, each music note swara can represent more than one pitch value, usually two, according to the raga followed by the composition. Generally, there are no special signs or symbols to represent the variety of the note. This information is implicitly associated with the raga of the song. There are a few special symbols used in the notation scheme for denoting some musical features like articulation, ornaments etc. All of them are notated using symbols attached with the swara letter. This includes symbols for an ascending or descending glide, foreign note, stressed note, repeat symbols, and gamaka mark.
Rarely, notes that are not part of the raga specification are used; such notes are marked with an asterisk over the swara symbol. The repeat symbol, usually found at the end of an avarta (measure), denotes that the portion of music should be repeated. The gamaka mark, represented by a tilde over the swara symbol, symbolizes ornamentation, which is of utmost importance to Indian music. A music phrase, unlike in western music, represents a set of musical notes that are sung together in one breath and is marked by hyphens at the start and end of the phrase.
The music notes with adjoint symbols are written on a straight line similar to tonic sol-fa notation in western music. The music notes are then grouped according to the rhythm structure (tala) of the composition, which is similar to the grouping of notes with time signatures in western music notation.
Here, we explain the grouping mechanism in comparison with grouping in western notation for easy understanding. In western notation, notes are grouped according to the indicated time measure to form equi-measured bars as demonstrated in Fig. In Carnatic music, the grouping or structuring of music notes is done according to the rhythm pattern (tala).
The tala rhythm specification consists of a set of basic elements (angas), each with a specific duration. The basic rhythm pattern repeats over the entire composition. The music notes are grouped in such a way that the total duration of the music notes is equal to the corresponding anga duration as illustrated in Fig.
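The grouping rule can be illustrated with a short sketch. The function `group_by_anga` and its input format (a list of symbol and duration pairs) are hypothetical, chosen only for this example. The anga durations 4, 2, 2 correspond to the common Adi tala (one laghu of four counts followed by two drutams of two counts each) and are used here purely as sample data.

```python
from itertools import cycle

def group_by_anga(notes, anga_durations):
    """Group (symbol, duration) pairs so each group fills one anga exactly.

    The anga pattern repeats cyclically over the whole composition.
    Raises ValueError if a note crosses an anga boundary or the last
    anga is left incomplete.
    """
    if not anga_durations:
        raise ValueError("anga_durations must not be empty")
    targets = cycle(anga_durations)
    target = next(targets)
    groups, current, filled = [], [], 0
    for symbol, duration in notes:
        current.append(symbol)
        filled += duration
        if filled > target:
            raise ValueError(f"note {symbol!r} crosses an anga boundary")
        if filled == target:
            groups.append(current)
            current, filled = [], 0
            target = next(targets)
    if current:
        raise ValueError("final anga is incomplete")
    return groups

# Uppercase letters last two aksharakalas, lowercase letters one.
notes = [("s", 1), ("r", 1), ("G", 2), ("m", 1), ("p", 1), ("D", 2)]
print(group_by_anga(notes, [4, 2, 2]))
# [['s', 'r', 'G'], ['m', 'p'], ['D']]
```

The groups have different total durations (4, 2, 2), which is the variable-measure behaviour that distinguishes this grouping from equi-measured western bars.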
This shows that Indian music follows variable measured bars as opposed to equi-measured bars in western music. This is illustrated in Fig. Music representation systems encompass musical information in any of the three levels: sound, music notation, or data for analysis [ 20 ]. Music notations are generally an encoding of abstract representations of music. They contain instructions for performance and representation of sound. Music representation systems can be classified as audio signal representations, resulting from the recording of sound sources or from direct electronic synthesis, and symbolic representations which represent discrete musical events such as notes, rhythm etc.
The proposed system is a symbolic representation, and is content-aware and can relate musical events to formalized concepts of Carnatic music theory. The musical representation systems can also be classified according to the encoding system format used for storing the information. They can be further classified as record-based, command-based, codes, and LISP-based. Also, many popular score-writing programs like Rhapsody and Sibelius use proprietary formats.
Unlike these extensions of western music, the proposed system is a unique approach to representation of South Indian Carnatic music based on Indian music theory. That means we use various Unicode symbols to represent musical entities in Carnatic music. In this section, we describe the iSargam representation system by explaining its approach, encoding logic, and algorithm. The musical symbols used in Sargam notation are classified as singleton or grouped entity according to whether they have meaning or sense in single form or they make sense only when they combine with another musical entity.
For example, anumandra, an octave specification symbol, makes sense only when it is joined with a musical note (swara). Singleton musical entities are always found independent in the notation and have semantics of their own. This classification among music symbols is required due to difference in encoding single and group entities, where group entity symbols can be encoded together only and not individually. A music constituent is the most basic unit of music. Here, it consists of a pitch symbol (swara), a sthayi (octave), and duration information.
It may be noted that a rest note does not have pitch and octave but has duration. Thus, the basic constituent elements are swara and rest. So, the general syntax of a pitched music constituent can be defined as. It may be noted that the swara syllable alone is a complete musical constituent since it already contains octave and duration information.
The basic element can be grouped according to some rhythmic pattern or it can be further augmented with additional symbols or other music notes, forming various grouped entities. In our approach, the former is called rhythmic group and the latter is called notational grouping.
This latter is again classified into intra notational and inter notational group entities. Inter notational groupings are always associated with music constituents. Intra notational grouping occurs when the music constituent is further augmented by adding parameters which apply in a single note level. This is denoted by adding extra signs or symbols to the base syllable. In case of inter notational grouping, multiple musical notes are grouped together, mostly to give a musical expression such as a musical phrase and ascending or descending glides. Having defined the basic terminologies, now we attempt to present our encoding logic.
Initially, our system maps every Sargam notation symbol to a Unicode symbol. The chosen Unicode character resembles the Sargam notation symbol used. Each symbol is also assigned a priority number. The encoding logic depends on whether the music symbol is a singleton or grouped entity. So, for encoding of a notated composition, we take every symbol and check if it can be further split into different characters as illustrated in Fig.
This is done by checking the baseline and upperline of the character. An atomic symbol is a singleton entity and it is directly mapped to its representation Unicode symbol.
If the character is a grouped entity, then it might be a Swara symbol indicating a pitched note or it is a rest note, where both are music constituents. We process the constituent symbols together with a priority queue [ 35 ].
The priority queue orders them according to the preassigned symbol priority and thus produces unambiguous encoding. The iSargam system chooses the unique numbers carefully so as to make sure that the corresponding Unicode character almost fully resembles the actual music notation in appearance, even in the case of the grouping or joining of music notation symbols.
The advantage here is that Unicode symbols appear discrete in encoding, which favors easy identification of music entities for music processing, but in appearance, it appears joined, resembling the original notation. Also, it may be noted that in such a representation, a combined notation can be easily split to its constituent basic music entities.
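The priority-based encoding of a grouped entity can be sketched as follows. The mapping table and the priority numbers below are hypothetical, since the actual iSargam code points and priorities are not listed in this section; only the general idea (a full-width base letter followed by combining marks emitted in priority order) is taken from the text. The combining characters used are standard Unicode code points.

```python
import heapq

# Hypothetical table: adjunct name -> (priority, combining mark).
ADJUNCTS = {
    "upper_octave":  (1, "\u0307"),  # COMBINING DOT ABOVE
    "lower_octave":  (1, "\u0323"),  # COMBINING DOT BELOW
    "half_duration": (2, "\u0305"),  # COMBINING OVERLINE
    "foreign_note":  (3, "\u20F0"),  # COMBINING ASTERISK ABOVE
    "gamaka":        (4, "\u0303"),  # COMBINING TILDE
}

def to_fullwidth(letter):
    """Map an ASCII letter to its full-width form (fixed offset 0xFEE0)."""
    if not ("A" <= letter <= "Z" or "a" <= letter <= "z"):
        raise ValueError(f"not an ASCII letter: {letter!r}")
    return chr(ord(letter) + 0xFEE0)

def encode_swara(letter, adjuncts):
    """Encode one pitched note: base letter, then marks in priority order."""
    heap = [ADJUNCTS[name] for name in adjuncts]
    heapq.heapify(heap)
    encoded = to_fullwidth(letter)
    while heap:
        _, mark = heapq.heappop(heap)
        encoded += mark
    return encoded

a = encode_swara("s", ["gamaka", "upper_octave"])
b = encode_swara("s", ["upper_octave", "gamaka"])
print(a == b)                          # True: input order does not matter
print([f"U+{ord(c):04X}" for c in a])  # ['U+FF53', 'U+0307', 'U+0303']
```

Because the marks leave the queue in a fixed order, the same set of adjunct symbols always yields the same code point sequence, which is what makes the encoding unambiguous.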
We use Unicode full width forms for standalone music elements like swara syllable or duration, and Unicode combining diacritical marks for adjunct symbols like octave, stress, foreign note indication, duration symbols which symbolize duration less than one unit, etc. More specifically, all the intra notational symbols and violin marks are represented by combining diacritical marks. Additionally, we use Unicode full width symbols for representing symbols in the rhythmic group like anga, avarta symbols, which are analogous to measure and bar markings for western music, and other rhythm-specific elements like laghu, plutum etc.
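Since standalone elements are full-width characters and adjunct symbols are combining marks, an encoded string can be split back into its entities using only standard Unicode properties. The sketch below is an assumption about how such splitting might be done, not the iSargam algorithm itself; `split_entities` and `base_letter` are hypothetical names.

```python
import unicodedata

def split_entities(text):
    """Split encoded text into (base character, [combining marks]) pairs."""
    entities = []
    for ch in text:
        # unicodedata.combining returns a non-zero class for combining marks.
        if unicodedata.combining(ch) and entities:
            entities[-1][1].append(ch)
        else:
            entities.append((ch, []))
    return entities

def base_letter(ch):
    """Map a full-width letter back to ASCII via compatibility normalization."""
    return unicodedata.normalize("NFKC", ch)

encoded = "\uff53\u0307\u0303\uff52"  # full-width s + dot above + tilde, full-width r
for base, marks in split_entities(encoded):
    names = [unicodedata.name(m) for m in marks]
    print(base_letter(base), names)
# s ['COMBINING DOT ABOVE', 'COMBINING TILDE']
# r []
```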
The encoded file consists of various sections, viz, the header, rhythm markup, and actual composition, explained as follows. The header section accompanies every music notation. It mainly consists of two sections, viz, a compulsory part and an optional part. The compulsory part is known as the music description part, and it specifies the most important elements for interpreting the notation. These most important fields are raga name and tala name. The header elements are considered as keywords which are case-insensitive and are separated by a colon character.
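A header of this kind could be read as sketched below. The concrete layout (one "keyword: value" pair per line) and the function name `parse_header` are assumptions made for the example; the text only states that keywords are case-insensitive, that a colon separates keyword and value, and that raga name and tala name are compulsory. Values are kept verbatim.

```python
REQUIRED_FIELDS = {"raga", "tala"}

def parse_header(lines):
    """Parse header lines into a dict with lowercase keywords."""
    header = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing colon in header line: {line!r}")
        # Keywords are case-insensitive; values are stored unchanged.
        header[key.strip().lower()] = value.strip()
    missing = REQUIRED_FIELDS - header.keys()
    if missing:
        raise ValueError(f"missing compulsory fields: {sorted(missing)}")
    return header

print(parse_header(["RAGA: Mohanam", "Tala: Adi", "mela: 28"]))
# {'raga': 'Mohanam', 'tala': 'Adi', 'mela': '28'}
```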
These values are case-sensitive. The tala section marks the rhythm pattern of the composition. Usually, a rhythm pattern is defined as a combination of its basic elements called angas, as described in the previous section. This section contains the notation of the actual composition.
It consists of music constituents along with required signs and symbols with notational and rhythmic grouping. The encoding of the actual music notation is illustrated in the following subsections. The music notes are grouped according to the tala specification, splitting them into many anga and avarta as described in the previous section. Sometimes, a repeat symbol is inserted in front of the avarta end to denote repetition of an avarta.
The encoding strategy followed for a pitched music note and an unpitched note is different. The encoding logic for an unpitched note is straightforward like a singleton entity. But a pitched note is regarded as a grouped entity and we use priority-based encoding for this set as illustrated in Fig. Unlike the encoding strategy used for singleton entities, encoding for grouped entities is done together and not as individual elements. The paper also includes a short discussion of our research project AUTRIM (automated transcription system for Indian music), developed in collaboration with Prof.
Wim van der Meer of University of Amsterdam. T M Krishna and V. Ishwar: "Carnatic Music: Svara, Gamaka, Motif and Raga Identity" [ article , slides , video ] Over the last century, in Karnatic music, the method of understanding raga has been to break it down into its various components: svara, scale, gamaka and phrases. In this paper, an attempt is made to define the abstract concept of raga in its entirety, within the Karnatic aesthetic, considering the various factors and, importantly, the symbiotic relationship between svara, gamaka and phraseology.
This paper also attempts to prove that the identity of a raga exists as a whole. We first illustrate the concept of the fundamental musical note, or svara, and then deal with the concept of gamaka, or inflexions. We then discuss the concept of a raga and its identity in terms of svara, gamaka and phraseology. The region on the scale where the melody begins seems to be an important feature for discriminating makams, especially those that use the same intervallic order or scale.
It appears that, to clarify the distinction of such makams which use the same scale within this context, a semiotics approach can contribute to better understanding of such characteristics. It is proposed here that performing an analytical study based on observation of melody initiation points and based on the concepts defined by C. Via this approach, it could be possible to derive some features to capture that characteristic and further use it for computer based analysis of discriminating features of such makams. Sordo, G. Koduri, S. Gulati and X.
As a way to demonstrate the usefulness of the methods we are also developing a system to browse and interact with specific audio collections. The system is an online web application that interfaces with all the data gathered (audio, scores plus contextual information) and all the descriptions that are automatically generated with the developed methods.
In this paper we present the basic architecture of the proposed system, the types of data sources that it includes, and we mention some of the culture-specific issues that we are working on for its development. The system is in a preliminary stage but it shows the potential that MIR technologies can have in browsing and interacting with music collections of various cultures. Babacan, C. Frisson and T. Dutoit: "Improving the Understanding of Turkish Makam Music through the MediaCycle Framework" [ article , video ] The goal of this work is to investigate the challenges of creating a tool to aid people of diverse profiles, from musicology experts and music information retrieval (MIR) specialists to interested non-technical users outside these fields, in understanding traditional makam music of Turkey.
We aim at providing a playground approach, with which MIR specialists can easily validate algorithms for feature extraction, clustering and visualization, and non-technical users can navigate by easily varying parameters and triggering audiovisual previews. We adapted the MediaCycle framework for organization of media files by similarity. AudioCycle, its audio application, allows users to cluster a large number of audio files against a subset of extracted audio features, visualized in a 2D space through positions, distances, colors. Transitions between parametric changes are animated, which helps the user create and retain a mental model of the sounds and their relationships.
For our proof-of-concept, we defined our use case as detecting makamlar plural from makam music. We integrated the pitch histogram technique proposed by Bozkurt et. Therefore, the debates on these topics and research attempts for such a system have not been finalized. Within the frame of these attempts, the starting point should be a thorough analysis of performances to eliminate the disparities between the theory and the performance. Accurate evaluation of the results of such analysis might lead to a theory that has its roots in performance and enable us to describe a system that is coherent with the performance.
Thus, the technology should be utilized for the analysis of the audio recordings from well-known performers. In this work, audio recordings, which are recorded by present-day performers specifically for this research, and audio recordings from past masters are analyzed. The results of the analysis are demonstrated and compared to the theoretical values (Holder comma values).
The effects of the performance on the melodic structure are also investigated. Lartillot and M. Through a dialogue between anthropological survey, musical analysis and cognitive modeling, one main objective is to bring to light the psychological processes and interactive levels of cognitive processing underlying the perception of modal structures in Maqam improvisations. One current axis of research in this project is dedicated to the design of a comprehensive modeling of the analysis of maqam music founded on a complex interaction between progressive bottom-up processes of transcription, modal analysis and motivic analysis and the impact of top-down influence of higher-level information on lower-level inferences.
Another ongoing work attempts to formalize the syntagmatic role of melodic ornamentation as a Retentional Syntagmatic Network (RSN) that models the connectivity between temporally close notes.
We propose a specification of those syntagmatic connections based on modal context. A computational implementation allows an automation of motivic analysis that takes into account melodic transformations. The ethnomusicological impact of this model is under consideration. Srinivasamurthy and P. In this paper, we extend the capabilities of this system to encode Carnatic music and propose a unified system for Indian classical music.
It enables us to systematically encode Carnatic music compositions into a machine readable format. The linear text-based intermediate representation for data entry is also extended to encode additional metadata useful in Carnatic music. The representation system will be useful for symbolic music research, generation of synthetic melodies, and comparative analyses.
Bozkurt and M.
In this paper, first a review of an n-gram based approach is presented using various representations of the symbolic data. While a high degree of precision can be obtained, confusion happens mainly for makams using almost the same scale and pitch hierarchy but differ in overall melodic progression, seyir. To further improve the system, first n-gram based classification is tested for various sections of the piece to take into account a feature of the seyir that melodic progression starts in a certain region of the scale.
In a second test, a hierarchical classification structure is designed which uses n-grams and seyir features in different levels to further improve the system. Font and X. Serra: "Analysis of the Folksonomy of Freesound" [ article , slides , video ] User generated content shared in online communities is often described using collaborative tagging systems where users assign labels to content resources. As a result, a folksonomy emerges that relates a number of tags with the resources they label and the users that have used them.
In this paper we analyze the folksonomy of Freesound, an online audio clip sharing site which contains more than two million users and , user-contributed sound samples covering a wide variety of sounds.
By following methodologies taken from similar studies, we compute some metrics that characterize the folksonomy both at the global level and at the tag level. In this manner, we are able to better understand the behavior of the folksonomy as a whole, and also obtain some indicators that can be used as metadata for describing tags themselves. We expect that such a methodology for characterizing folksonomies can be useful to support processes such as tag recommendation or automatic annotation of online resources.
Sordo, J. Koduri and X. Serra: "A Method for Extracting Semantic Information from on-line Art Music Discussion Forums" [ article , slides , video ] In this paper a method for extracting semantic information from online music discussion forums is proposed. The semantic relations are inferred from the co-occurrence of musical concepts in forum posts, using network analysis. The method starts by defining a dictionary of common music terms in an art music tradition. Then, it creates a complex network representation of the online forum by matching such dictionary against the forum posts.
Once the complex network is built we can study different network measures, including node relevance, node co-occurrence and term relations via semantically connecting words. Moreover, we can detect communities of concepts inside the forum posts. The rationale is that some music terms are more related to each other than to other terms. All in all, this methodology allows us to obtain meaningful and relevant information from forum discussions.
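The co-occurrence idea in this abstract can be illustrated with a toy sketch. It is not the authors' method: the function name `cooccurrence_network`, the whitespace tokenization, and the sample posts and dictionary are all assumptions made for the example. Edge weights simply count how many posts mention both terms.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_network(posts, dictionary):
    """Count, for each pair of dictionary terms, the posts containing both."""
    edges = Counter()
    for post in posts:
        words = set(post.lower().split())   # naive tokenization (assumption)
        found = sorted(words & dictionary)  # sorted so each pair has one key
        edges.update(combinations(found, 2))
    return edges

posts = [
    "This raga uses gamaka heavily",
    "The tala and raga of this kriti",
    "Gamaka in this raga",
]
terms = {"raga", "tala", "gamaka", "kriti"}
network = cooccurrence_network(posts, terms)
print(network[("gamaka", "raga")])  # 2
print(network[("raga", "tala")])    # 1
```

Measures such as node relevance or community structure would then be computed on the weighted graph defined by these edge counts.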
Bozkurt: "Features for Analysis of Makam Music" [ article , slides , video ] For computational studies of makam music, it is essential to gather a list of characteristics that constitute a makam and explore corresponding quantitative features for automatic analysis. This study is such an attempt where we address the characteristics for makams as defined in theory books and deduce a list of quantitative features.
The target here is to evoke discussions on some measurable features other than providing complete analysis on the discriminative potentials of each proposed feature which could be the subject of a few larger studies. Sarala, V. Ishwar, A.
Bellur and H. Murthy: "Applause Identification and its Relevance to Archival of Carnatic Music" [ article , slides , video ] A Carnatic music concert is made up of a sequence of pieces, where each piece corresponds to a particular genre and raaga (melody). Unlike a western music concert, the artist may be applauded intra-performance and inter-performance. Most Carnatic music that is archived today corresponds to single audio recordings of entire concerts. The purpose of this paper is to segment single audio recordings into a sequence of pieces using the characteristic features of applause and music.
Spectral flux and spectral entropy change quite significantly from music to applause and vice versa. The characteristics of these features for a subset of concerts were studied. A threshold-based approach was used to segment the pieces into music fragments and applause. Simpler bols can be played simultaneously to create compound bols and sequential placement of such bols leads to the formation of rhythms. The systematic permutation and combination of bols generate recurring accent markers that contribute to the perceived metricality of a sequence. Principles of Auditory Scene Analysis can help in understanding how metrical structure is established based on timbre similarities and differences.
Strategies adopted by Hindustani musicians in reinforcing rhythms at different tempi can also be explained by phenomena such as stream segregation. At slow speeds (vilambit laya), performers use filler sounds to maintain perceptual connectivity. At high speeds (drut laya), perceptual segregation of compound bols into simpler components creates layers that provide scaffolding that helps in maintaining a steady rhythm.
The lens of Auditory Scene Analysis is a productive way to view how the acoustic properties of different bols influence their interaction as a function of tempo. It also offers insights about the strategic leveraging of such relations in tabla performance and the perception of emergent rhythms by listeners. Srinivasamurthy, S.
Subramanian, G. Tronel and P. We present an algorithm that uses a beat similarity matrix and inter onset interval histogram to automatically extract the sub-beat structure and the long-term periodicity of a musical piece. The accuracy on the difficult CMDB was poorer with Holzapfel and B. Bozkurt: "Metrical Strength and Contradiction in Turkish Makam Music" [ article , slides , video ] In this paper we investigate how note onsets in Turkish Makam music compositions are distributed, and in how far this distribution supports or contradicts the metrical structure of the pieces, the usul.
We use MIDI data to derive the distributions in the form of onset histograms, and compare them with metrical weights that are applied to describe the usul in theory. We compute correlation and syncopation values to estimate the degrees of support and contradiction, respectively. While the concept of syncopation is rarely mentioned in the context of this music, we can gain interesting insight into the structure of a piece using such a measure.
We show that metrical contradiction is systematically applied in some metrical structures. We will compare the differences between Western music and Turkish Makam music regarding metrical support and contradiction. Such a study can help avoid pitfalls in later attempts to perform audio processing tasks such as beat tracking or rhythmic similarity measurements.
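The comparison of onset histograms with metrical weights can be sketched as below. This is a simplified illustration and not the authors' procedure: the function names, the tick-based input, and the weight vector are hypothetical, and only the correlation (degree of support) is shown, not the syncopation measure.

```python
from math import sqrt

def onset_histogram(onset_ticks, cycle_ticks, bins):
    """Count note onsets per metrical position within a repeating cycle."""
    hist = [0] * bins
    for tick in onset_ticks:
        # Integer arithmetic avoids floating-point errors at bin edges.
        hist[(tick % cycle_ticks) * bins // cycle_ticks] += 1
    return hist

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

onsets = [0, 4, 8, 12, 16, 24]   # onset times in ticks (sample data)
weights = [3, 1, 2, 1]           # hypothetical metrical weights of a 4-position cycle
hist = onset_histogram(onsets, cycle_ticks=16, bins=4)
print(hist)                      # [2, 1, 2, 1]
print(round(pearson(hist, weights), 3))
```

A correlation close to 1 would indicate that onsets concentrate on the positions the theory weights most strongly; lower or negative values would suggest contradiction of the metrical structure.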
Demoucron, S. Weisser and M. Leman: "Sculpting the Sound. Timbre-Shapers in Classical Hindustani Chordophones" [ article , video ] Chordophones of the contemporary classical Hindustani tradition are characterized by the presence of one or both of these two specific devices: the sympathetic strings (taraf), from about 10 to over 30, and the curved wide bridge (jawari), sometimes reinforced by a cotton thread.
Based on field recordings and interviews, this study aims to quantify the contribution of taraf strings and the wide curved bridge (jawari) to the global sound of the different instruments and settings. Acoustical analyses are correlated with ethnomusicological analyses in order to evaluate the aesthetic, musical and perceptual role of the tarafs and jawaris.
Serra and J. Arcos: "Signal Analysis of Ney Performances" [ article , video ] Ney is an end-blown flute which is mainly used for Makam music. Although a score representation based on extending Western music notation has been used since the beginning of the 20th century, actual Ney music cannot be fully represented by a written score because of its rich embellishment repertoire.