<![CDATA[std::steam]]>https://jehoshaphatia.hashnode.devRSS for NodeSat, 09 Nov 2024 11:32:00 GMT60<![CDATA[100 Days Of ML Code — Day 100]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-100-f8a8f959616https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-100-f8a8f959616Sat, 27 Oct 2018 13:46:01 GMT<![CDATA[<p>100 Days Of ML Code Day 100</p><h2 id="recap-from-day-099"><strong>Recap from day 099</strong></h2><p>In day 099 we looked at something a little bit different that actually goes back to some of what we talked about when we looked at timbre: how we create the frequency representations of sound, the sonogram and the spectral view, that we used when discussing timbre.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-099-db5c47620ea4"><strong>100 Days Of ML Code Day 099</strong><em>Recap from day 098</em>medium.com</a></p><p>Today, we'll continue from where we left off in day 099.</p><h2 id="the-process"><strong>The Process</strong></h2><p>Windowing is when we take a waveform and split it up into tiny little bits. Then we take each of those tiny little bits and do this thing called periodicization. There's really nothing to this: we just pretend that the little bit repeats infinitely, so that it's a periodic sample. Then, on each of those little windows, we apply a method called the Fast Fourier Transform, which you'll often see abbreviated as FFT. We apply this process in order to convert our time-domain set of amplitude values into information about frequency. So, I'm going to go through each of those steps in more detail.</p><h2 id="windowing"><strong>Windowing</strong></h2><p>The first step is windowing, where what we do is divide the audio into equal-size, overlapping frames. Let me show you what I mean. We pick a number of samples to include in each frame. 
So, our frame size might be 1024 samples, for instance. These are tiny frames: 1024 samples, if our sampling rate is 44,100 hertz, is about 1/43rd of a second, roughly 23 milliseconds. So, tiny fractions of a second.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823887991/2eEbGLoYD.png" alt /></p><p>So if we were taking the waveform seen above and splitting it up, the first red line that I've drawn under the waveform might be one frame, and then we're going to overlap them with each other. The second red line under the first one might be another, the third line above the second might be another, and so on and so forth, all the way through our file.</p><p>But it's actually more complicated than what I've shown above, because those frames are overlapping and we want smooth transitions from one to the next; each of them kind of fades in and fades out.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823889818/Nap3MXg8v.png" alt /></p><p>So the first one I'm going to fade in and fade out with an amplitude envelope, as represented by the green annotation seen above. The next one we'll fade in and fade out too, and so on. So there's always one that's fading in and always one that's fading out, with an overlap like the one seen above, and so on and so forth.</p><p>So that's what windowing is: we end up with these windows that fade in and fade out, each a tiny fraction of a second long. Then we take each of those windows and, this is the easy part, we pretend that it's a periodic function.</p><h2 id="periodicization"><strong>Periodicization</strong></h2><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823891798/etCD7cVQW.png" alt /></p><p>So we take a tiny little window like image A above, and we repeat it again and again and again, like in image B above. We just pretend that the repetition goes on forever. 
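The windowing step described above can be sketched in a few lines of Python (a minimal sketch, assuming NumPy; the 1024-sample frame size and 50% overlap are illustrative choices, and a Hann window stands in for the fade-in/fade-out envelope):

```python
import numpy as np

def window_frames(signal, frame_size=1024, hop=512):
    """Split a signal into overlapping frames, each multiplied by a
    Hann window so that adjacent frames cross-fade smoothly."""
    envelope = np.hanning(frame_size)  # fades in from 0, peaks, fades back to 0
    starts = range(0, len(signal) - frame_size + 1, hop)
    return np.array([signal[s:s + frame_size] * envelope for s in starts])

# One second of a 440 Hz sine wave sampled at 44,100 Hz.
sr = 44100
signal = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
frames = window_frames(signal)
print(frames.shape)  # (85, 1024): 85 overlapping, windowed frames
```

With a hop of half the frame size, every sample away from the edges is covered by exactly two frames, one fading out while the next fades in, which is the overlap described above.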
So now we've met the periodic requirement of the Fourier Theorem.</p><h2 id="the-fast-fourier-transform"><strong>The Fast Fourier Transform</strong></h2><p>The final step is called the Fast Fourier Transform. The details of how this algorithm works are a little beyond the scope of this article; I encourage you to look up more details if you're interested, and I'll point you towards some references. For now, I just want to treat it as a black box and explain what goes in and what comes out.</p><p>What goes in are the amplitude samples over time in the frame. So if our frame size is 1,024, we'd have 1,024 amplitude values going in. What comes out are a set of amplitudes and phases, one for each frequency bin.</p><p>In other words, I'm going to divide my frequency space into a series of linearly spaced bins, and then I'm going to look at what's going on in each of those. How much energy is there in each of those bins? And what is the phase of the sine wave represented by each of those bins?</p><p>There are some simple ways to calculate how the algorithm divides this up: my number of frequency bins is half of my frame size, and the width between each of these bins is my Nyquist frequency, the highest frequency I can represent at my sampling rate, divided by my number of bins.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823893577/PotHUCYeT.png" alt /></p><p>Let's work through an example here just to make sure this is totally clear. If my frame size is 1,024 samples and my sampling rate is 44,100 Hertz, then my Nyquist frequency is 44,100 divided by 2, so 22,050. My number of bins is the frame size, 1,024, divided by two, so that's 512. And my bin width is my Nyquist frequency, 22,050 Hertz, divided by my number of bins, 512. This comes out to about 43 Hertz; it's a little bit more than 43 Hertz. 
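This black-box view is easy to try out (a sketch assuming NumPy; `np.fft.rfft` is one common FFT implementation, and the 440 Hz test tone is an arbitrary choice): 1,024 amplitude values go in, and amplitudes and phases for the frequency bins come out.

```python
import numpy as np

frame_size = 1024
sample_rate = 44100

nyquist = sample_rate / 2        # 22,050 Hz
n_bins = frame_size // 2         # 512 bins
bin_width = nyquist / n_bins     # about 43.07 Hz per bin

# Feed one frame of a 440 Hz sine wave into the FFT.
t = np.arange(frame_size) / sample_rate
frame = np.sin(2 * np.pi * 440 * t)
spectrum = np.fft.rfft(frame)    # one complex value (amplitude + phase) per bin

amplitudes = np.abs(spectrum)
peak_bin = int(np.argmax(amplitudes))
print(round(bin_width, 2))       # 43.07
print(peak_bin)                  # 10: the bin nearest 440 Hz (10 * 43.07 ~ 431 Hz)
```

Note that `np.fft.rfft` actually returns 513 values for a 1,024-sample frame: the bins discussed here plus the endpoint. The peak landing in bin 10, around 431 Hz rather than exactly 440 Hz, is the limited frequency resolution at work.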
So that means my frequency bins are going to be spaced at zero, 43, 86, 129, and so on and so forth, all the way up to 22,050 Hertz.</p><p>So that's how the frequency space is divided up, and at that point I have information about what's going on in each of those frequency areas, so you can see how we could generate a sonogram from there. I could take each of these frames and generate one vertical strip of frequency view in my sonogram based on the data that comes back.</p><h2 id="issues-and-tradeoffs"><strong>Issues and Tradeoffs</strong></h2><p>I want to talk about some of the issues with the process described above, because it is not a perfect process.</p><p>First of all, it's a lossy process: I lose data. If I do this Fast Fourier Transform and then go back to my waveform, I've lost something, because I've split things up into these linear frequency bins, so I only know what's happening at a fairly low resolution as we move up in frequency. I also only know what's happening at a fairly low resolution in time, because I only know what's happening frame by frame, 1,024 samples at a time in the example we've been using. So there's actually a big trade-off when I pick my frame size.</p><p>The trade-off is how much resolution I want in the time domain versus how much I want in the frequency domain. If I want to know exactly when things are happening in time along my x-axis, I can pick a very low frame size, so my frames are really tiny and I get a lot of time resolution horizontally, but then my bin width gets huge and I know very little about what's happening vertically in my frequency dimension.</p><p>If I want to know a lot vertically, in my frequency dimension, I can pick a really high frame size, but then a lot of time passes from one frame to the next, and I lose a lot of resolution horizontally, in the time domain.</p><p>One more point I wanted to make is that the frequency space is 
divided linearly, but if you remember from psychoacoustics, we actually hear pitch not linearly but logarithmically, so a lot of the linear frequency bins are kind of wasted, if you will, on things very high up in frequency space. Half of the bins cover what we would hear as just the final octave of our frequency space. So this isn't a great match either, but that's how this particular algorithm works.</p><p>Wow, you're still here. It's the 100th day. You deserve some accolades for hanging in here till the end. I hope you found the journey from day 001 to day 100 informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823895166/Tb-KkNq0Z.png<![CDATA[100 Days Of ML Code — Day 099]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-099-db5c47620ea4https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-099-db5c47620ea4Fri, 26 Oct 2018 15:26:07 GMT<![CDATA[<p>100 Days Of ML Code Day 099</p><h2 id="recap-from-day-098"><strong>Recap from day 098</strong></h2><p>In the past two days, we've talked briefly about how to calculate the storage space of digital audio data based on decisions we've made about bit width, the number of channels, and sampling rate. We've talked about ways to reduce that storage space through lossless and lossy file formats, and the implications of each.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-098-bdc089bb9e30"><strong>100 Days Of ML Code Day 098</strong><em>Recap from day 097</em>medium.com</a></p><p>Today, we're going to move on to something a little bit different that actually goes back to some of what we talked about when we looked at timbre: how we create the frequency representations of sound, the sonogram and the spectral view, that we used when discussing timbre.</p><h2 id="frequency-domain-analysis"><strong>Frequency Domain Analysis</strong></h2><p>I want to cover a somewhat complex topic, but I think it's really important for us to understand: how we get from the waveform representation of digital audio, where we have time on our x-axis and amplitude on our y-axis, to the sonogram representation, where we can see much more information about the frequency and timbre content of the sound.</p><p>We're going to talk about how we arrive at the sonogram and the role the Fourier Theorem plays in that. 
We're going to talk about how we work around the limitations of the Fourier Theorem, through a process of windowing, periodicization, and the Fast Fourier Transform, in order to take any sound we might want to look at and represent it as a sum of a series of sine waves.</p><p>We'll talk about some implications of this algorithm, particularly two parameters, the frame size and the bin width, that we need to think about very carefully as we're configuring it, because they have some serious implications for the results we get.</p><h2 id="from-waveform-to-sonogram"><strong>From Waveform to Sonogram</strong></h2><p>Now that we know how sound is represented digitally on a computer, it's pretty obvious how a waveform representation like the one seen in the image below comes about. We simply take the successive amplitude values and plot them over time on the x-axis, and then we have our waveform; we can connect the dots if we want to make it look a little nicer.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*HfRJtjGuTicXdCZl.png" alt="Source" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwjy1dqxtKTeAhXNz4UKHdyxBLkQjRx6BAgBEAU&url=https%3A%2F%2Fpixabay.com%2Fen%2Fphotos%2Fwave%2F%3Fcat%3Dmusic&psig=AOvVaw0kxSESJAWnOj7YVwmBAWbX&ust=1540653674856701">Source</a></em></p><p>But how we get from the kind of representation above to the one seen below is not obvious, because when we represent sound digitally we're encoding a series of amplitude values over time; we're not including any information about frequency at all. 
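To see concretely that only amplitudes are stored, here is a small sketch (assuming NumPy; the 440 Hz tone and one-second duration are arbitrary choices):

```python
import numpy as np

sample_rate = 44100                           # samples per second
t = np.arange(sample_rate) / sample_rate      # one second of sample times
samples = 0.5 * np.sin(2 * np.pi * 440 * t)   # a digitized 440 Hz tone

# The stored representation is just this flat array of amplitude values.
# Nothing in the data itself says "440 Hz"; frequency has to be inferred.
print(len(samples))   # 44100 amplitude values
print(samples[:3])    # the first few amplitudes
```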
So that's why we need to think about this a little more carefully, and think about how we get to the representation seen below.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*DC8KmkUaHNQUjzbN" alt="Source" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwj_w7XStKTeAhWKzYUKHZrKAQUQjRx6BAgBEAU&url=https%3A%2F%2Fwww.nps.gov%2Fsubjects%2Fsound%2Fsounds-uau.htm&psig=AOvVaw20HqUxppZimfST1uFBbxW3&ust=1540653752911696">Source</a></em></p><p>So we're going to revisit the Fourier Theorem, which we looked at in the timbre article. I want to look at it in a little more depth now.</p><p>Just to recap, the Fourier Theorem says that any periodic waveform can be represented as a sum of sine waves at frequencies that are integer multiples of a fundamental frequency. We looked at examples of this with a sawtooth wave and with a trombone sound, seeing how we could combine sine waves together.</p><p>We wouldn't hear them anymore as individual sine waves; we'd hear them coming together to create a single composite sound, because of the special relationship they have to each other, being integer multiples of a base frequency, and because of the way they are linked.</p><p>I also mentioned a really important limitation here: the periodicity limitation. It only works for periodic waveforms, like a perfect sine wave or a perfect square wave, and that isn't how sounds work in the real world. They're not perfectly periodic. They don't repeat a cycle infinitely, over and over and over again, without any variation.</p><p>So one problem is that we've captured the spectral aspect of timbre but not the envelope aspect, the changing-in-time aspect of it. 
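The sawtooth example can be reconstructed directly from this statement of the theorem (a minimal sketch, assuming NumPy; the 100 Hz fundamental, the 1/k harmonic weights of the standard sawtooth series, and the cutoff at 50 harmonics are illustrative choices):

```python
import numpy as np

sample_rate = 44100
f0 = 100                                   # fundamental frequency in Hz
t = np.arange(sample_rate) / sample_rate   # one second of sample times

# Sum sine waves at integer multiples of f0, with amplitude 1/k for
# harmonic k -- the Fourier series of an idealized sawtooth wave.
sawtooth = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 51))

# Truncating at 50 harmonics only approximates the sawtooth; an exact
# reproduction would require infinitely many sine waves.
```

Because every harmonic is an integer multiple of 100 Hz, the sum repeats exactly every 441 samples (one 100 Hz cycle at 44,100 Hz), which is what makes the waveform periodic.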
The other problem is that when we say "a sum of sine waves," there's an important caveat: a potentially infinite number of sine waves may be required to do the summation, and computers don't tend to like infinity very much. They're not continuous beings. They're discrete; they do things as sets of zeros and ones.</p><p>So if we need a potentially infinite number of sine waves to do the summation, that's also going to be really problematic for us. What we do instead is use the basic idea of the Fourier Theorem but tweak it a little: we kind of fake it out, if you will, by pretending that we're working with periodic waves, and we apply a process that doesn't do things perfectly but doesn't require an infinite number of sine waves either. There are three stages to the process that I'm going to talk about in detail.</p><h2 id="the-process"><strong>The Process</strong></h2><p>Windowing is when we take a waveform and split it up into tiny little bits. Then we take each of those tiny little bits and do this thing called periodicization. There's really nothing to this: we just pretend that the little bit repeats infinitely, so that it's a periodic sample. Then, on each of those little windows, we apply a method called the Fast Fourier Transform, which you'll often see abbreviated as FFT. We apply this process in order to convert our time-domain set of amplitude values into information about frequency.</p><p>So, I'm going to go through each of those steps in more detail tomorrow. That's all for day 099. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. 
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823910929/N3SRrPCJr.jpeg<![CDATA[100 Days Of ML Code — Day 098]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-098-bdc089bb9e30https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-098-bdc089bb9e30Thu, 25 Oct 2018 15:54:10 GMT<![CDATA[<p>100 Days Of ML Code Day 098</p><h2 id="recap-from-day-097"><strong>Recap from day 097</strong></h2><p>In day 097 we looked at how we actually take audio data and store it: how much space it takes up on disk, and the different file formats available to us to manage that.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-097-d8ed558ce652"><strong>100 Days Of ML Code Day 097</strong><em>Recap from day 096</em>medium.com</a></p><p>Today we will continue from where we left off in day 097.</p><h2 id="digital-audio-storage-continued"><strong>Digital Audio Storage Continued</strong></h2><p>There are other lossless compression formats that you'll encounter from time to time. ALAC is Apple's. 
Its called Apple Lossless Audio Codec but its not very well supported by many other programs that arent made by Apple or dont use Apples APIs for Audio.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*SkbXjpyXEn1k73iw.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823917625/XwG9hJIee.html)" /><em><a target="_blank" href="https://www.lifewire.com/free-audio-converter-software-programs-2622863">Source</a></em></p><p>Theyre tools for importing and exporting audio files but this is something thats available If you need to get that two to one saving in size because youre trying to email a file to someone or share it somewhere or whatever might be but you want to keep all those amplitude values perfectly intact, this can be a good technique.</p><p>What people usually want to do, wind up doing when they want to save space is they use a lossy file format. lossy file format will compress the file size in a way that you can never get the original back but it does it using a perceptual encoding strategy. In other words, it actually considers how we hear sound, psychoacoustic.</p><p>Its just like we were talking about earlier in this module and it thinks about what are the things were not going to miss so much in the sound. What are some frequencies that we cant hear that well or particularly ones that might get kind of hidden or covered up by other audio content thats in the sound.</p><p>They try to use that to make intelligent decisions about what to leave out and what to keep in and youve all heard Im sure of some popular file formats in the lossy category. Mp3 is the most popular, AAC is fairly popular as well, Ogg Vorbis is another one thats used quite a bit.</p><p>Many others as well, the ones listed above are three of the most popular ones and they usually get you about a 90% savings over the original. 
So, instead of 10 MB per minute of CD-quality sound (44,100 Hz, 16-bit stereo), you usually get about 1 MB per minute, depending on the exact savings.</p><p>That's a substantial saving, particularly useful in a lot of scenarios in terms of how we consume music today. If you're on your cell phone trying to stream music tracks from a provider, you can't stream a WAVE file over a weak connection, or you might not want to use up your data plan on all that streaming.</p><p>So you can use a lossy file format and get something that sounds pretty good over your cell phone while saving about 90% of your data. It can be useful in a lot of situations like that.</p><p>I do want to issue a very important warning here: it's a lossy format for a reason, and you can never get the original back. So if you're making your own music, it would be a horrible idea to save it only in a lossy format like MP3, AAC, or Ogg Vorbis.</p><p>Let's say you later want to go back and edit it, make some changes, or re-encode it in another format. You'd be doing all that to a version that has lost some of the amplitude data of the original; those amplitude values are not going to be the same as when you created and recorded them, so it's never going to sound quite as good as the original version you created in a lossless format.</p><p>And if you then go and try to re-encode it in a lossy format again, this is a very common thing to see: take an MP3, decompress it into a waveform, do some editing on it, and save it as an MP3 again. We've basically done two different MP3 compressions, the first when I saved it the first time and the second after I decoded it, edited it, and saved it again.</p><p>That's going to compound the effects of the losses.
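</p><p>The rough sizes quoted above can be checked with quick arithmetic. The 128 kbps bitrate below is a typical MP3 setting assumed for illustration; the post itself does not name a bitrate.</p>

```python
# Back-of-the-envelope check of the file-size figures above.
# CD quality: 44,100 samples/s * 16 bits/sample * 2 channels.
cd_bits_per_min = 60 * 44100 * 16 * 2
cd_mb_per_min = cd_bits_per_min / 8 / 1024 / 1024
print(f"CD-quality WAVE: {cd_mb_per_min:.1f} MB/min")   # roughly 10 MB

# A common MP3 bitrate (128 kbps is an assumed, typical setting):
mp3_mb_per_min = 128_000 * 60 / 8 / 1024 / 1024
print(f"128 kbps MP3:    {mp3_mb_per_min:.2f} MB/min")  # under 1 MB

print(f"savings: {1 - mp3_mb_per_min / cd_mb_per_min:.0%}")  # about 90%
```

<p>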
So its always a good idea when youre editing files, when youre working with them, when youre saving them, your own music for archival purposes, save it in the lossless format like a WAVE or an AIFF file or even like a FLAC, Free Lossless Audio Codec. Something thats going to help you get back the full quality of the original if you ever want to edit it or re-encode it again in the future.</p><p>Thats all for day 098. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 098</p><h2 id="recap-from-day-097"><strong>Recap from day 097</strong></h2><p>In day 097 we looked at how we actually take data and store it. How much space it takes up on our disk and different file formats that are available to us to manage that.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-097-d8ed558ce652"><strong>100 Days Of ML Code Day 097</strong><em>Recap from day 096</em>medium.com</a></p><p>Today we will continue from where we left off in day 097</p><h2 id="digital-audio-storage-continued"><strong>Digital Audio Storage Continued</strong></h2><p>There are other lossless compression formats that youll encounter from time to time. ALAC is Apples. 
Its called Apple Lossless Audio Codec but its not very well supported by many other programs that arent made by Apple or dont use Apples APIs for Audio.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*SkbXjpyXEn1k73iw.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823917625/XwG9hJIee.html)" /><em><a target="_blank" href="https://www.lifewire.com/free-audio-converter-software-programs-2622863">Source</a></em></p><p>Theyre tools for importing and exporting audio files but this is something thats available If you need to get that two to one saving in size because youre trying to email a file to someone or share it somewhere or whatever might be but you want to keep all those amplitude values perfectly intact, this can be a good technique.</p><p>What people usually want to do, wind up doing when they want to save space is they use a lossy file format. lossy file format will compress the file size in a way that you can never get the original back but it does it using a perceptual encoding strategy. In other words, it actually considers how we hear sound, psychoacoustic.</p><p>Its just like we were talking about earlier in this module and it thinks about what are the things were not going to miss so much in the sound. What are some frequencies that we cant hear that well or particularly ones that might get kind of hidden or covered up by other audio content thats in the sound.</p><p>They try to use that to make intelligent decisions about what to leave out and what to keep in and youve all heard Im sure of some popular file formats in the lossy category. Mp3 is the most popular, AAC is fairly popular as well, Ogg Vorbis is another one thats used quite a bit.</p><p>Many others as well, the ones listed above are three of the most popular ones and they usually get you about a 90% savings over the original. 
So, instead of 10 MB per minutes of CD-quality sound, 44,100 Hz, 16-bit stereo, you usually get about 1 MB per minute, depending on the exact savings.</p><p>So, thats a substantial saving particularly useful in a lot of scenarios in terms of how we consume music today. If you are on your cell phone and youre trying to stream music tracks from a music provider, you cant stream a WAVE file on your crappy 3G connection or whatever connection you have available or you might not want to use your data plan up for all that streaming.</p><p>So, you can use a lossy file format and heres something thats pretty good over your cell phone but takes up only you know, saves you 90% of your data. So it can be useful in a lot of situations like that.</p><p>I do want to issue a very important warning here, its a lossy format for a reason, you can never get the original back. And so if youre making your own music it would be a horrible idea to only save that in a lossy format like an MP3 or an AAC or Ogg Vorbis or something like that.</p><p>Lets say you then later want them go back and edit it or make some changes, or re-encode in another format, well, youd be doing all that for a version that has lost some of the amplitude data of the original, those amplitude values are not going to be the same as when you created them, recorded them and so is never going to sound quite as good as the original version that you created in a lossless format.</p><p>And if you then go and try to re-encode it in a lossy format again, this is a very common thing to see. Take an MP3, decompress it into a WAVE form, do some editing on it and save it as an MP3 again. Well, weve basically done two different MP3 compressions, the first one, when I save that the first time and the second after Ive decoded it and edited it and Im saving it again.</p><p>Thats going to compound the effects of the losses when I do that. 
So its always a good idea when youre editing files, when youre working with them, when youre saving them, your own music for archival purposes, save it in the lossless format like a WAVE or an AIFF file or even like a FLAC, Free Lossless Audio Codec. Something thats going to help you get back the full quality of the original if you ever want to edit it or re-encode it again in the future.</p><p>Thats all for day 098. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823919663/vyrjOQgRd.png<![CDATA[100 Days Of ML Code — Day 097]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-097-d8ed558ce652https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-097-d8ed558ce652Wed, 24 Oct 2018 15:46:35 GMT<![CDATA[<p>100 Days Of ML Code Day 097</p><h2 id="recap-from-day-096"><strong>Recap from day 096</strong></h2><p>In day 095 and 096 we talked about the way that we hear sound in space: interaural delay time, head related transfer function and we also talked about binaural recording and processing, which are very effective if we are just working with headphones. And then we talked about different speaker configurations available to us for a diffusion of sound and space through speakers.</p><p>You can catch up using the links below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-096-119d5a932fa3"><strong>100 Days Of ML Code Day 096</strong><em>Recap from day 095</em>medium.com</a><a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-095-6cd4367ee920"><strong>100 Days Of ML Code Day 095</strong><em>//</em>medium.com</a></p><p>Today, well talk about the question of how we actually take data and store it. 
How much space it takes up on our disk and different file formats that are available to us to manage that.</p><h2 id="digital-audio-storage"><strong>Digital Audio Storage</strong></h2><p>So, weve decided on our sampling rate and our bit-width, how many channels we need to represent audio for a particular situation. Now, how do we figure out how much space it takes up? thats what were going to cover today.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*8wxzEEjq8uLCLnIO.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823952844/olPw6RhPT.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiCmJepr5_eAhVwx4UKHcHHBdsQjRx6BAgBEAU&url=https%3A%2F%2Fwww.lifewire.com%2Fbest-free-mp3-tools-for-converting-your-music-files-2438474&psig=AOvVaw0Yr-FpslOPefZyIQo4Xahg&ust=1540480487257274">Source</a></em></p><p>Were going to look at how to calculate storage space and then were going to look at different file formats to use of lossless file formats that preserve all of our amplitude values perfectly and lossy file formats that can save us a lot of disk space but loosen data in the process.</p><p>So, before we look through the file formats lets just go through some very simple calculations here. lets assume that we had 1 minute of audio, at 16-bits, 44,100 Hz and stereo, so two channels. How much disc space would this actually take up to store?</p><p>So weve got, 60 seconds, multiplied by 44,000 100 samples per second multiplied by16 bits, per sample multiplied by two channels, two samples, per moment in time. This comes out to about 84 million Bits.</p><p>Now before you start freaking out thats bits, thats not usually how we talk about digital data. So, if we convert 84 million to bytes we divide by 8 and then were going to go to kilobytes we would divide by 1024 and then if we wanted to go to megabytes, wed divide by 1,024 again. 
And that number is going to come out to about 10 MB.</p><p>So, in order to store 1 minute of 16-bit, 44,100 Hz stereo sound, we need about 10 MB of disk space. Now, how are we going to store it? Let's assume we've got plenty of space, so that's not an issue; we just want to get it onto disk.</p><p>The easiest thing we can do is use a standard file format, which basically takes all the amplitude values, all the binary digits, and lays them out on disk in a structured way. The two most popular formats for doing that these days are WAVE files and AIFF files.</p><p>There was a time long ago when WAVE was the Windows format and AIFF was the Apple format. Any music technology program we'd encounter these days supports both formats equally well. There are a lot of more obscure formats that aren't used nearly as much.</p><p>WAVE files and AIFF files are supported by just about every audio program out there. If we wanted to save some space, we could try to compress this data using a lossless compression format.</p><p>What a lossless compression format does is something similar to what a ZIP archive does for other types of files. It goes through and tries to re-encode all our amplitude values in a way that represents the most commonly used ones a little more efficiently, at the expense of representing some of the less frequently used ones less efficiently.</p><p>Using a technique like this, we can usually save about 50%. So, instead of our 10 MB per minute of CD-quality sound, we'd have about 5 MB to represent that same minute, and the most popular format here is FLAC, which stands for Free Lossless Audio Codec.</p><p>That's all for day 097. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey.
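</p><p>The storage calculation above, step by step, with the variable names chosen here just for readability:</p>

```python
# 1 minute of CD-quality audio, exactly as derived in the text.
seconds = 60
sample_rate = 44100   # samples per second
bit_depth = 16        # bits per sample
channels = 2          # stereo

bits = seconds * sample_rate * bit_depth * channels
print(bits)                        # 84,672,000 -- "about 84 million bits"

bytes_ = bits / 8                  # bits -> bytes
kilobytes = bytes_ / 1024          # bytes -> KB
megabytes = kilobytes / 1024       # KB -> MB
print(round(megabytes, 2))         # about 10 MB per minute

# A lossless codec such as FLAC typically halves this:
print(round(megabytes * 0.5, 2))   # about 5 MB per minute
```

<p>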
And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 097</p><h2 id="recap-from-day-096"><strong>Recap from day 096</strong></h2><p>In day 095 and 096 we talked about the way that we hear sound in space: interaural delay time, head related transfer function and we also talked about binaural recording and processing, which are very effective if we are just working with headphones. And then we talked about different speaker configurations available to us for a diffusion of sound and space through speakers.</p><p>You can catch up using the links below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-096-119d5a932fa3"><strong>100 Days Of ML Code Day 096</strong><em>Recap from day 095</em>medium.com</a><a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-095-6cd4367ee920"><strong>100 Days Of ML Code Day 095</strong><em>//</em>medium.com</a></p><p>Today, well talk about the question of how we actually take data and store it. How much space it takes up on our disk and different file formats that are available to us to manage that.</p><h2 id="digital-audio-storage"><strong>Digital Audio Storage</strong></h2><p>So, weve decided on our sampling rate and our bit-width, how many channels we need to represent audio for a particular situation. Now, how do we figure out how much space it takes up? 
thats what were going to cover today.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*8wxzEEjq8uLCLnIO.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823952844/olPw6RhPT.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiCmJepr5_eAhVwx4UKHcHHBdsQjRx6BAgBEAU&url=https%3A%2F%2Fwww.lifewire.com%2Fbest-free-mp3-tools-for-converting-your-music-files-2438474&psig=AOvVaw0Yr-FpslOPefZyIQo4Xahg&ust=1540480487257274">Source</a></em></p><p>Were going to look at how to calculate storage space and then were going to look at different file formats to use of lossless file formats that preserve all of our amplitude values perfectly and lossy file formats that can save us a lot of disk space but loosen data in the process.</p><p>So, before we look through the file formats lets just go through some very simple calculations here. lets assume that we had 1 minute of audio, at 16-bits, 44,100 Hz and stereo, so two channels. How much disc space would this actually take up to store?</p><p>So weve got, 60 seconds, multiplied by 44,000 100 samples per second multiplied by16 bits, per sample multiplied by two channels, two samples, per moment in time. This comes out to about 84 million Bits.</p><p>Now before you start freaking out thats bits, thats not usually how we talk about digital data. So, if we convert 84 million to bytes we divide by 8 and then were going to go to kilobytes we would divide by 1024 and then if we wanted to go to megabytes, wed divide by 1,024 again. And that number is going to end up coming out to be about 10 MB.</p><p>So, in order to store 1 minute of 16-bit 44100 Hz stereo sound, we need about 10 MB of disk space. So, how are we going to store this? Let's assume weve got plenty of space to store, thats not an issue, we just want to store it on this.</p><p>The, easiest thing we can do, is just to, use a, a standard file format. 
Basically, it takes all the amplitude values, all the binary digits and just kind of plots them onto disk in a structured format. The two most popular formats for doing that these days are WAVE files and AIFF files.</p><p>There was a time long ago when WAVE was the Windows format and AIFF was the Apple format. In any music technology program, wed be encountering these days, they would both support, they would all support both formats just as well. Theres a lot of more obscure formats that arent used nearly as much.</p><p>WAVE files and AIFF files are supported by just about every audio program out there. If we wanted to save some space, we could try to compress this data. And we could use a lossless compression format.</p><p>What a lossless compression format would do is something similar to what like a ZIP archive would do for other types of files. It would go through and it would try to re-encode all our amplitude values in a way that represents the most commonly used ones a little bit more efficiently, at the expense of representing some of the less frequently used ones less efficiently.</p><p>Using a technique like this, we could save usually about 50%. So, instead of our 10 MB per minute of CD-quality sound, wed have about 5 MB to represent that same minute and the most popular format here is, is FLAC, that stands for Free Lossless Audio Codec.</p><p>Thats all for day 097. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. 
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823955084/4_yyptNWqk.png<![CDATA[100 Days Of ML Code — Day 096]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-096-119d5a932fa3https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-096-119d5a932fa3Tue, 23 Oct 2018 21:16:06 GMT<![CDATA[<p>100 Days Of ML Code Day 096</p><h2 id="recap-from-day-095"><strong>Recap from day 095</strong></h2><p>In day 095 we talked about HRTF (head-related transfer function) and closed with what binaural sound essentially is.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-095-6cd4367ee920"><strong>100 Days Of ML Code Day 095</strong><em>//</em>medium.com</a></p><p>Today, well look deeper into binaural sound.</p><h2 id="binaural-sound"><strong>Binaural Sound</strong></h2><p>A binaural sound is essentially spatialized sound designed explicitly to be heard over headphones. There are two different ways that we can we can create binaural sound. One is that we can use a special microphone called a binaural microphone. 
Below is a picture of one.</p><p><img src="https://cdn-images-1.medium.com/max/2000/1*ROYdsu5RtgMtt2DCVdr9DQ.jpeg" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823935295/4AHyMtfkS.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiAx7_6up3eAhWSDuwKHcAQCgIQjRx6BAgBEAU&url=https%3A%2F%2Fwww.studiocare.com%2Fbinaural-enthusiast-b1-e-dummy-head-with-be-p1-binaural-microphones-battery-box.html&psig=AOvVaw3zKEmW0GkbDzPtXLDG8Q_y&ust=1540414836814434">Source</a></em></p><p>The microphone as seen above essentially looks like a pretend human head with pretend ears sticking on the side and then what actually happens is there is a little microphone embedded inside of that pretend ear, and so we can put the microphone in a place when were recording and thats going to mimic the inner aural delay time because those microphones are placed apart from each other somewhat according to the placement of our ears in our heads and also, the material that its made out of is going to mimic that head-related transfer function of the sound passing through our heads and our ears.</p><p>Say we play back the sound over headphones and we send what was recorded in the left ear to our left ear same as was recorded in the right ear to our right ear wed get a very good sense of how the sounds were in space at the time that we were recording and what was in front of us, to our left and right, behind us and so on and so forth.</p><p>The other thing that we can do of course is we can try to simulate those effects digitally through applying artificially the phase difference in the interval delay time and some filtering, changing the frequency response to mimic that head-related transfer function to simulate sound coming from a particular location of space.</p><p>Binaural sound is great if youre listening over headphones but it doesnt work so well if were listening over speakers because when were listening over 
speakers If we have, say stereo speakers we dont have the luxury of having only the left channel go to our left ear and only the right channel going to our right ear.</p><p>We dont get that isolation. Theyre both going to go to both of our ears so we cant really simulate the interaural delay time or head-transfer function so well over speakers, where both channels are going to both ears.</p><p>So if we can get some limited sense of spatialization with stereo left and right but if we want to get more serious, we need to get more speakers involved and you all have probably heard of a 5.1, Surround Sound, for instance. Thats the basic idea that we have.</p><p>We have in the front, we would have a left, centre and right, and then in the back, wed have a left and a right, thats how we get our five channels of sound. This is what is used in movie theatres, thats whats used in home theatre systems as well most of the time to try to give a basic sense of sound moving through space.</p><p>If we have those five channels, we can do that. We can obviously have more channels, too. its fairly common especially in kind of the academic world of computer music to have eight or ten or even more channels of surround where there are more and more speakers around the space to be able to simulate space a little bit more precisely. And when we do this we have special software that can usually help us to figure out how much of each sound we want to send to each speaker.</p><p>Thats all for day 096. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. 
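</p><p>One common technique such software uses to decide how much of each sound goes to each speaker is constant-power panning, where the gains follow cosine and sine curves so the overall loudness stays constant as a sound moves. This is a generic sketch for a stereo pair, assumed here for illustration; it is not a specific surround algorithm from the post.</p>

```python
import math

def constant_power_pan(position):
    """Constant-power pan for a stereo pair.
    position: 0.0 = hard left, 1.0 = hard right.
    Returns (left_gain, right_gain)."""
    angle = position * math.pi / 2
    # cos/sin gains keep left^2 + right^2 == 1 at every position,
    # so perceived loudness stays roughly constant across the pan.
    return math.cos(angle), math.sin(angle)

for pos in (0.0, 0.5, 1.0):
    left, right = constant_power_pan(pos)
    print(f"pos={pos}: L={left:.3f} R={right:.3f}")
```

<p>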
And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 096</p><h2 id="recap-from-day-095"><strong>Recap from day 095</strong></h2><p>In day 095 we talked about HRTF (head-related transfer function) and closed with what binaural sound essentially is.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-095-6cd4367ee920"><strong>100 Days Of ML Code Day 095</strong><em>//</em>medium.com</a></p><p>Today, well look deeper into binaural sound.</p><h2 id="binaural-sound"><strong>Binaural Sound</strong></h2><p>A binaural sound is essentially spatialized sound designed explicitly to be heard over headphones. There are two different ways that we can we can create binaural sound. One is that we can use a special microphone called a binaural microphone. Below is a picture of one.</p><p><img src="https://cdn-images-1.medium.com/max/2000/1*ROYdsu5RtgMtt2DCVdr9DQ.jpeg" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823935295/4AHyMtfkS.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiAx7_6up3eAhWSDuwKHcAQCgIQjRx6BAgBEAU&url=https%3A%2F%2Fwww.studiocare.com%2Fbinaural-enthusiast-b1-e-dummy-head-with-be-p1-binaural-microphones-battery-box.html&psig=AOvVaw3zKEmW0GkbDzPtXLDG8Q_y&ust=1540414836814434">Source</a></em></p><p>The microphone as seen above essentially looks like a pretend human head with pretend ears sticking on the side and then what actually happens is there is a little microphone embedded inside of that pretend ear, and so we can put the microphone in a place when were recording and thats going to mimic the inner aural delay time because those microphones are placed apart from each other somewhat according to the placement of our ears in our heads and also, the material that its made out of is going to mimic that head-related transfer function of the sound passing through our heads and 
our ears.</p><p>Say we play back the sound over headphones and we send what was recorded in the left ear to our left ear same as was recorded in the right ear to our right ear wed get a very good sense of how the sounds were in space at the time that we were recording and what was in front of us, to our left and right, behind us and so on and so forth.</p><p>The other thing that we can do of course is we can try to simulate those effects digitally through applying artificially the phase difference in the interval delay time and some filtering, changing the frequency response to mimic that head-related transfer function to simulate sound coming from a particular location of space.</p><p>Binaural sound is great if youre listening over headphones but it doesnt work so well if were listening over speakers because when were listening over speakers If we have, say stereo speakers we dont have the luxury of having only the left channel go to our left ear and only the right channel going to our right ear.</p><p>We dont get that isolation. Theyre both going to go to both of our ears so we cant really simulate the interaural delay time or head-transfer function so well over speakers, where both channels are going to both ears.</p><p>So if we can get some limited sense of spatialization with stereo left and right but if we want to get more serious, we need to get more speakers involved and you all have probably heard of a 5.1, Surround Sound, for instance. Thats the basic idea that we have.</p><p>We have in the front, we would have a left, centre and right, and then in the back, wed have a left and a right, thats how we get our five channels of sound. This is what is used in movie theatres, thats whats used in home theatre systems as well most of the time to try to give a basic sense of sound moving through space.</p><p>If we have those five channels, we can do that. We can obviously have more channels, too. 
its fairly common especially in kind of the academic world of computer music to have eight or ten or even more channels of surround where there are more and more speakers around the space to be able to simulate space a little bit more precisely. And when we do this we have special software that can usually help us to figure out how much of each sound we want to send to each speaker.</p><p>Thats all for day 096. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823937154/__X8wIk1N.jpeg<![CDATA[100 Days Of ML Code — Day 095]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-095-6cd4367ee920https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-095-6cd4367ee920Mon, 22 Oct 2018 21:20:58 GMT<![CDATA[<p>100 Days Of ML Code Day 095</p><h2 id="recap-from-day-094"><strong>Recap from day 094</strong></h2><p>In day 094 we addressed the question of how many channels we need to record sound in different scenarios to represent the location of sound in space. 
We learned that interaural delay time (IDT) is the difference between when a sound gets to your right ear and when it gets to your left ear.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-094-161921734728"><strong>100 Days Of ML Code Day 094</strong><em>Recap from day 093</em>medium.com</a></p><p>Today, we are going to continue from where we left off in day 094.</p><h2 id="hrtf-head-related-transfer-function"><strong>HRTF (head-related transfer function)</strong></h2><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823926402/1L6gL-jdxX.png" alt /></p><p>HRTF (head-related transfer function) simply means that if we look at the yellow line going to the right ear in the image above, we'll see that for sound to reach your right ear it's actually travelling through your head. So, depending on where it's coming from, sound might have to get through your head to reach your ear.</p><p>It might have to get through parts of your ear, such as your outer ear; it might have to go through all different parts of your head in order to reach your ear and actually be heard, and as it travels through your head, that's different from travelling through air. In particular, some of the higher-frequency components of that sound are going to lose more energy, so it's going to sound different by the time it gets to your ear because it had to travel through your head or your outer ear along the way.</p><p>Again, we automatically pick up on these cues, and this helps us figure out where a sound is coming from. So, interaural delay time and the head-related transfer function are really powerful cues for getting a sense of where a sound is coming from in space.
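</p><p>To put a rough number on the interaural delay: a simple spherical-head approximation (the Woodworth model, which is not from the original post) estimates the delay from the head radius and the direction of the source. The head radius and speed of sound below are typical assumed values, not measurements.</p>

```python
import math

def itd_seconds(azimuth_deg, head_radius=0.0875, speed_of_sound=343.0):
    """Woodworth spherical-head estimate of the interaural delay.

    azimuth_deg: source direction, 0 = straight ahead, 90 = directly
    to one side. head_radius (m) and speed_of_sound (m/s) are typical
    assumed values."""
    theta = math.radians(azimuth_deg)
    # Delay = path difference around the head divided by the speed of sound.
    return (head_radius / speed_of_sound) * (theta + math.sin(theta))

print(f"{itd_seconds(0) * 1e6:.0f} us")   # no delay for a source straight ahead
print(f"{itd_seconds(90) * 1e6:.0f} us")  # largest delay, well under a millisecond
```

<p>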
So we can take advantage of these, if we're listening on headphones, to do something called binaural sound.</p><p>A binaural sound is essentially spatialized sound designed explicitly to be heard over headphones. We'll see binaural sound in detail in day 096.</p><p>That's all for day 095. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 095</p><h2 id="recap-from-day-094"><strong>Recap from day 094</strong></h2><p>In day 094 we addressed the question of how many channels we need to record sound in different scenarios to represent the location of sound in space. We learned that interaural delay time (IDT) is the difference between when a sound gets to your right ear and when it gets to your left ear.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-094-161921734728"><strong>100 Days Of ML Code Day 094</strong><em>Recap from day 093</em>medium.com</a></p><p>Today, we are going to continue from where we left off in day 094.</p><h2 id="hrtf-head-related-transfer-function"><strong>HRTF (head-related transfer function)</strong></h2><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823926402/1L6gL-jdxX.png" alt /></p><p>HRTF (head-related transfer function) simply means that if we look at the yellow line going to the right ear in the image above, we'll see that for sound to reach your right ear it's actually travelling through your head. So, depending on where it's coming from, sound might have to get through your head to reach your ear.</p><p>It might have to get through parts of your ear, such as your outer ear; it might have to go through all different parts of your head in order to reach your ear and actually be heard, and as it travels through your head, that's different from travelling through air.
In particular, some of the higher frequency components of that sound are going to lose some more energy and so its going to sound different by the time it gets to your ear because it had to travel through your head or through your outer ear through all these different places.</p><p>Again, we kind of automatically pick up on these queues and this helps us to figure out where a sound is coming from. So, interaural delay time and head-related transfer function are really powerful ques for us to get a sense of where a sound is coming from in space. So we can take advantage of these if were listening on headphones, to do something called a binaural sound.</p><p>A binaural sound is essentially spatialized sound designed explicitly to be heard over headphones. Well see binaural sound in details in day 095.</p><p>Thats all for day 095. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823928921/Y85FvaU-Xw.png<![CDATA[100 Days Of ML Code — Day 094]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-094-161921734728https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-094-161921734728Sun, 21 Oct 2018 20:17:39 GMT<![CDATA[<p>100 Days Of ML Code Day 094</p><h2 id="recap-from-day-093"><strong>Recap from day 093</strong></h2><p>In day 093 weve talked about bit width, what it means in terms of binary digits. 
We also talked about its implications: when we record sound, we're going to take advantage of our bit width but not go beyond the binary digits that are available to us.</p><p>You can catch up using the link below<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-093-3e52b7a25cfe"><strong>100 Days Of ML Code Day 093</strong><em>In the last two days, we talked about sampling rate, and how to determine an appropriate sampling rate to represent</em>medium.com</a></p><p>Today, we'll address the question of how many channels we need to record sound in different scenarios to represent the location of a sound in space.</p><h2 id="channels-and-spatialization"><strong>Channels and spatialization</strong></h2><p>We're going to address the issue of channels and spatialization; essentially, how many amplitude values do we need to record at each sample in time in order to send sound out to the two ears of a pair of headphones, or to multiple speakers, and to simulate the location of sounds in space.</p><p>As we talk about channels and spatialization, I first want to talk about two important phenomena related to this, Interaural Delay Time and the Head-Related Transfer Function, and then we'll talk about how these can combine to simulate the spatialization of sound through headphones, through a process called binaural sound. Then we'll talk about what we might need to do differently if we're sending sound out over multiple speakers with sound diffusion.</p><h2 id="interaural-delay-time"><strong>Interaural Delay Time</strong></h2><p>First I want to talk about Interaural Delay Time and the Head-Related Transfer Function.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823944113/yfz93hUMO3.png" alt /></p><p>So let's pretend that the image above is showing your head, and we have a sound coming from the speaker over there. As that sound is travelling to your ears, imagine that there's a sound wave going to your right ear and a sound wave going to your left ear.</p><p>Now, what's important here is that the lengths of those two yellow lines that I drew in the image are a little different from each other. It's going to take longer for the sound to get to your right ear than to your left ear because it has to travel just a little bit further, and so there's going to be a difference in phase between those two sound waves as they reach your two ears.</p><p>When we're listening to sounds in the real world, we can automatically process that difference in phase and use it as a cue to understand where that sound is coming from. So, that's called interaural delay time (IDT): the difference in the time delay between when sound gets to your right ear and when it gets to your left ear.</p><p>That's all for day 094. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 094</p><h2 id="recap-from-day-093"><strong>Recap from day 093</strong></h2><p>In day 093, we talked about bit width and what it means in terms of binary digits. We also talked about its implications: when we record sound, we're going to take advantage of our bit width but not go beyond the binary digits that are available to us.</p><p>You can catch up using the link below<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-093-3e52b7a25cfe"><strong>100 Days Of ML Code Day 093</strong><em>In the last two days, we talked about sampling rate, and how to determine an appropriate sampling rate to represent</em>medium.com</a></p><p>Today, we'll address the question of how many channels we need to record sound in different scenarios to represent the location of a sound in space.</p><h2 id="channels-and-spatialization"><strong>Channels and spatialization</strong></h2><p>We're going to address the issue of channels and spatialization; essentially, how many amplitude values do we need to record at each sample in time in order to send sound out to the two ears of a pair of headphones, or to multiple speakers, and to simulate the location of sounds in space.</p><p>As we talk about channels and spatialization, I first want to talk about two important phenomena related to this, Interaural Delay Time and the Head-Related Transfer Function, and then we'll talk about how these can combine to simulate the spatialization of sound through headphones, through a process called binaural sound. Then we'll talk about what we might need to do differently if we're sending sound out over multiple speakers with sound diffusion.</p><h2 id="interaural-delay-time"><strong>Interaural Delay Time</strong></h2><p>First I want to talk about Interaural Delay Time and the Head-Related Transfer Function.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823944113/yfz93hUMO3.png" alt /></p><p>So let's pretend that the image above is showing your head, and we have a sound coming from the speaker over there. As that sound is travelling to your ears, imagine that there's a sound wave going to your right ear and a sound wave going to your left ear.</p><p>Now, what's important here is that the lengths of those two yellow lines that I drew in the image are a little different from each other. It's going to take longer for the sound to get to your right ear than to your left ear because it has to travel just a little bit further, and so there's going to be a difference in phase between those two sound waves as they reach your two ears.</p><p>When we're listening to sounds in the real world, we can automatically process that difference in phase and use it as a cue to understand where that sound is coming from. So, that's called interaural delay time (IDT): the difference in the time delay between when sound gets to your right ear and when it gets to your left ear.</p><p>That's all for day 094. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey.
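<p>To put rough numbers on that delay: if the extra distance to the far ear is about the ear spacing times the sine of the source angle, the delay is that distance divided by the speed of sound. This is a back-of-the-envelope sketch, not a formula from the post; the head width and speed of sound below are assumed example values.</p>

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air, an assumed typical value

def interaural_delay(angle_deg, head_width=0.18):
    """Rough extra travel time (seconds) to the far ear.

    angle_deg: 0 means the source is straight ahead (no delay),
    90 means it is directly to one side. head_width is an assumed
    ear-to-ear distance in metres, not a measurement.
    """
    extra_path = head_width * math.sin(math.radians(angle_deg))
    return extra_path / SPEED_OF_SOUND

print(interaural_delay(0))   # 0.0: both ears hear the sound at the same time
print(interaural_delay(90))  # roughly 0.0005 seconds, about half a millisecond
```

<p>Half a millisecond sounds tiny, but it is exactly the kind of difference our hearing picks up on automatically.</p>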
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823946220/ByaBJLRP_.png<![CDATA[100 Days Of ML Code — Day 093]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-093-3e52b7a25cfehttps://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-093-3e52b7a25cfeSat, 20 Oct 2018 20:59:16 GMT<![CDATA[<p>100 Days Of ML Code Day 093</p><h2 id="recap-from-day-091-and-092">Recap From Day 091 and 092</h2><p>In the last two days, we talked about sampling rate and how to determine an appropriate sampling rate to represent audio digitally, and that had to do with the x-axis, the time axis, of our waveform representation.</p><p>You can catch up using the links below<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-092-f4efd1365748"><strong>100 Days Of ML Code Day 092</strong><em>//100 Days Of ML Code Day 092</em>medium.com</a><a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-091-ad227e5c9f0"><strong>100 Days Of ML Code Day 091</strong><em>Recap From Day 090</em>medium.com</a></p><p>Starting from today, we're going to turn to the y-axis, the amplitude axis, and talk about how many binary digits, what our bit width needs to be, to represent each amplitude sample that we record. We'll talk more formally about what bit width is, and we'll review what binary numbers are, in case you're not familiar with them already. Then we'll also talk about some implications of bit width, in terms of how we record sound, and also how artists have used it in some interesting ways.</p><h2 id="bit-width"><strong>Bit width</strong></h2><p>Formally, the bit width is the number of binary digits that we use to represent the amplitude of each sample.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823961473/M8hoOSXQKe.png" alt /></p><p>So for each of the dots in the image shown above, how many binary numbers are we using on the computer? How many zero-or-one digits are we using to represent what that amplitude value is? So it's important that we think about this in terms of binary numbers.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823963346/eJeeKJwyC.png" alt /></p><p>If we had a bit width of one, for example, that would mean we use one binary digit, and a binary digit can either be zero or one. It's either on or it's off. So we have two possibilities here: it's either zero, or it's one. That means our resolution is effectively two; we have two options for how we're going to represent the amplitude, and that would obviously be an incredibly restrictive environment to work in.</p><p>If we have two bits, each of the two binary digits, as seen in the table above, could be either zero or one. So, two possibilities for the first digit, two possibilities for the second digit: 2 times 2 is 4. Still pretty limited. When we go up to 8 bits, which is actually used in some fairly low-resolution recordings, we have 8 binary digits, 2 to the 8th power possibilities, 256 possible amplitude values. In other words, we've taken the negative one to positive one amplitude space, that y-axis of our waveform, and limited it to 256 different possibilities evenly spread across that space.</p><p>At 16 bits, which is what we use on CDs, we have 2 to the 16th possibilities, about 65,000, and at 24 bits, which is what I like to use whenever possible, we have 2 to the 24th possibilities, about 16 million. Those extra eight bits from 16 bits to 24 bits get you a lot of extra resolution on your y-axis, from about 65,000 values up to about 16,000,000. Some people record at 32 bits as well, with some high-end audio software and hardware.</p><p>So, obviously, we want to record with as much resolution as we can, within the limits of whatever media we're working with.
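<p>The powers-of-two arithmetic above, and the act of snapping an amplitude to the nearest representable level, can be sketched in a few lines. The <code>quantize</code> helper is an illustrative name of my own, not something from the post:</p>

```python
def amplitude_levels(bit_width):
    """How many distinct amplitude values a given bit width can represent."""
    return 2 ** bit_width

def quantize(x, bit_width):
    """Snap an amplitude in [-1, 1] to the nearest representable level."""
    step = 2.0 / (amplitude_levels(bit_width) - 1)  # spacing across -1..+1
    return round((x + 1.0) / step) * step - 1.0

print(amplitude_levels(8))   # 256 possible amplitude values
print(amplitude_levels(16))  # 65536, the CD resolution
print(amplitude_levels(24))  # 16777216, about 16 million
print(quantize(0.5, 2))      # about 0.333: with 4 levels (-1, -1/3, 1/3, 1),
                             # 0.5 snaps to the nearest one, 1/3
```

<p>Notice how coarse 2 bits is: a sample at 0.5 can only land on one of four evenly spaced levels, which is exactly the restrictive environment described above.</p>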
Obviously, if we're targeting a CD, we're eventually going to be limited to 16 bits when we finally encode the file for the CD, and we can't use infinite amounts of disk space, an issue we'll get to later. But I also want to talk about the implications of this for recording, because it's not enough to just record something at a good bit width, using 16 bits or 24 bits or 32 bits or whatever.</p><p>It's very important that when you are recording you try to use the full dynamic range available to you, because if you are recording at a high bit width but only using a tiny bit of the negative one to positive one amplitude range, because you might be recording at a very low level or whatever else might be going on in your process, you're wasting all those bits. They just never get used for anything, so you're effectively recording at a much lower resolution.</p><p>But on the other hand, if you record too loud, where you'd use every single one of those bits no matter what, that's also a problem, because if you go over the negative one to positive one range, then you've run out of binary digits to represent those amplitude values, and they all just get clipped, cut off at positive one or negative one. You end up with something called digital distortion, which is also not a good thing. Basically, the peaks and the troughs of all your waveforms get lopped off, and that tends not to sound very good either.</p><p>That's all for day 093. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey.
And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 093</p><h2 id="recap-from-day-091-and-092">Recap From Day 091 and 092</h2><p>In the last two days, we talked about sampling rate and how to determine an appropriate sampling rate to represent audio digitally, and that had to do with the x-axis, the time axis, of our waveform representation.</p><p>You can catch up using the links below<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-092-f4efd1365748"><strong>100 Days Of ML Code Day 092</strong><em>//100 Days Of ML Code Day 092</em>medium.com</a><a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-091-ad227e5c9f0"><strong>100 Days Of ML Code Day 091</strong><em>Recap From Day 090</em>medium.com</a></p><p>Starting from today, we're going to turn to the y-axis, the amplitude axis, and talk about how many binary digits, what our bit width needs to be, to represent each amplitude sample that we record. We'll talk more formally about what bit width is, and we'll review what binary numbers are, in case you're not familiar with them already. Then we'll also talk about some implications of bit width, in terms of how we record sound, and also how artists have used it in some interesting ways.</p><h2 id="bit-width"><strong>Bit width</strong></h2><p>Formally, the bit width is the number of binary digits that we use to represent the amplitude of each sample.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823961473/M8hoOSXQKe.png" alt /></p><p>So for each of the dots in the image shown above, how many binary numbers are we using on the computer? How many zero-or-one digits are we using to represent what that amplitude value is? So it's important that we think about this in terms of binary numbers.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823963346/eJeeKJwyC.png" alt /></p><p>If we had a bit width of one, for example, that would mean we use one binary digit, and a binary digit can either be zero or one. It's either on or it's off. So we have two possibilities here: it's either zero, or it's one. That means our resolution is effectively two; we have two options for how we're going to represent the amplitude, and that would obviously be an incredibly restrictive environment to work in.</p><p>If we have two bits, each of the two binary digits, as seen in the table above, could be either zero or one. So, two possibilities for the first digit, two possibilities for the second digit: 2 times 2 is 4. Still pretty limited. When we go up to 8 bits, which is actually used in some fairly low-resolution recordings, we have 8 binary digits, 2 to the 8th power possibilities, 256 possible amplitude values. In other words, we've taken the negative one to positive one amplitude space, that y-axis of our waveform, and limited it to 256 different possibilities evenly spread across that space.</p><p>At 16 bits, which is what we use on CDs, we have 2 to the 16th possibilities, about 65,000, and at 24 bits, which is what I like to use whenever possible, we have 2 to the 24th possibilities, about 16 million. Those extra eight bits from 16 bits to 24 bits get you a lot of extra resolution on your y-axis, from about 65,000 values up to about 16,000,000. Some people record at 32 bits as well, with some high-end audio software and hardware.</p><p>So, obviously, we want to record with as much resolution as we can, within the limits of whatever media we're working with. Obviously, if we're targeting a CD, we're eventually going to be limited to 16 bits when we finally encode the file for the CD, and we can't use infinite amounts of disk space, an issue we'll get to later. But I also want to talk about the implications of this for recording, because it's not enough to just record something at a good bit width, using 16 bits or 24 bits or 32 bits or whatever.</p><p>It's very important that when you are recording you try to use the full dynamic range available to you, because if you are recording at a high bit width but only using a tiny bit of the negative one to positive one amplitude range, because you might be recording at a very low level or whatever else might be going on in your process, you're wasting all those bits. They just never get used for anything, so you're effectively recording at a much lower resolution.</p><p>But on the other hand, if you record too loud, where you'd use every single one of those bits no matter what, that's also a problem, because if you go over the negative one to positive one range, then you've run out of binary digits to represent those amplitude values, and they all just get clipped, cut off at positive one or negative one. You end up with something called digital distortion, which is also not a good thing. Basically, the peaks and the troughs of all your waveforms get lopped off, and that tends not to sound very good either.</p><p>That's all for day 093. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey.
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823965191/5IqBmjM_w.png<![CDATA[100 Days Of ML Code — Day 092]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-092-f4efd1365748https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-092-f4efd1365748Fri, 19 Oct 2018 14:35:22 GMT<![CDATA[<p>100 Days Of ML Code Day 092</p><h2 id="recap-from-day-091">Recap From Day 091</h2><p>In day 091, we looked at the Nyquist Theorem.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-091-ad227e5c9f0"><strong>100 Days Of ML Code Day 091</strong><em>Recap From Day 090</em>medium.com</a></p><p>Today, I'm going to talk briefly about what happens if our sampling rate is too low: we get something called foldover.</p><h2 id="foldover"><strong>Foldover</strong></h2><p>If our sampling rate is too low, it's not just that the frequencies above the Nyquist frequency, the highest frequency we can represent, disappear from our sound; they actually turn into other frequencies that the sampling rate can represent. So, I want to show you what I mean.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823972814/BQUFgiDiZ.png" alt /></p><p>In the image above, we've got a sine wave on the top, and we can look at the number of cycles in it, from peak to peak, peak to peak, peak to peak, peak to peak, so there are four plus a little bit more in it. That image uses a sampling rate of 44,100 hertz.</p><p>If we take that same sine wave and reduce the sampling rate down to something crazy low, like 284 hertz, we end up with something like what you see at the bottom of the image above. We're still getting a periodic sound here.</p><p>It's not a sine wave anymore, because we've lost the resolution of that curve, and it's also not the same number of cycles anymore, but we are getting cycles. We're getting one full cycle plus a little bit more in this particular slice of time.</p><p>We're going to hear that as a periodic sound. It's going to have a frequency to it. But it's not going to be the original frequency that we expected of the 440-hertz sine wave we had in the top image.</p><p>So just to quickly review: in the last two days, we talked about the Nyquist Theorem as a way to figure out an appropriate sampling rate, that our sampling rate needs to be at least double the highest frequency we want to represent, and we talked about how we arrived at 44,100 hertz as a fairly standard sampling rate.</p><p>We also talked about foldover and other effects that can happen when we're recording at a sampling rate that's too low. In the next couple of days, we're going to get into the question of bit width and how we decide what resolution we need to represent the amplitude of each sample.</p><p>That's all for day 092. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 092</p><h2 id="recap-from-day-091">Recap From Day 091</h2><p>In day 091, we looked at the Nyquist Theorem.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-091-ad227e5c9f0"><strong>100 Days Of ML Code Day 091</strong><em>Recap From Day 090</em>medium.com</a></p><p>Today, I'm going to talk briefly about what happens if our sampling rate is too low: we get something called foldover.</p><h2 id="foldover"><strong>Foldover</strong></h2><p>If our sampling rate is too low, it's not just that the frequencies above the Nyquist frequency, the highest frequency we can represent, disappear from our sound; they actually turn into other frequencies that the sampling rate can represent. So, I want to show you what I mean.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632823972814/BQUFgiDiZ.png" alt /></p><p>In the image above, we've got a sine wave on the top, and we can look at the number of cycles in it, from peak to peak, peak to peak, peak to peak, peak to peak, so there are four plus a little bit more in it. That image uses a sampling rate of 44,100 hertz.</p><p>If we take that same sine wave and reduce the sampling rate down to something crazy low, like 284 hertz, we end up with something like what you see at the bottom of the image above. We're still getting a periodic sound here.</p><p>It's not a sine wave anymore, because we've lost the resolution of that curve, and it's also not the same number of cycles anymore, but we are getting cycles. We're getting one full cycle plus a little bit more in this particular slice of time.</p><p>We're going to hear that as a periodic sound. It's going to have a frequency to it. But it's not going to be the original frequency that we expected of the 440-hertz sine wave we had in the top image.</p><p>So just to quickly review: in the last two days, we talked about the Nyquist Theorem as a way to figure out an appropriate sampling rate, that our sampling rate needs to be at least double the highest frequency we want to represent, and we talked about how we arrived at 44,100 hertz as a fairly standard sampling rate.</p><p>We also talked about foldover and other effects that can happen when we're recording at a sampling rate that's too low. In the next couple of days, we're going to get into the question of bit width and how we decide what resolution we need to represent the amplitude of each sample.</p><p>That's all for day 092. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey.
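<p>The folded-back pitch can be computed directly with the standard aliasing rule: the frequency you hear is the distance from the original frequency to the nearest multiple of the sampling rate. A small sketch (the function name is my own, not from the post):</p>

```python
def alias_frequency(f, sample_rate):
    """Frequency actually heard when a tone at f hertz is sampled at sample_rate."""
    f = f % sample_rate             # fold f into one sampling-rate-wide span
    return min(f, sample_rate - f)  # reflect around the Nyquist frequency

print(alias_frequency(440, 44100))  # 440: safely below Nyquist, unchanged
print(alias_frequency(440, 284))    # 128: the tone folds over to a new pitch
```

<p>So at the crazy-low 284 hertz rate from the example, the 440-hertz tone does not vanish; it comes back as a different, lower pitch.</p>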
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823974642/lLVn__exu.png<![CDATA[100 Days Of ML Code — Day 091]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-091-ad227e5c9f0https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-091-ad227e5c9f0Thu, 18 Oct 2018 08:07:24 GMT<![CDATA[<p>100 Days Of ML Code Day 091</p><h2 id="recap-from-day-090">Recap From Day 090</h2><p>In day 090, we looked at Sampling Rate.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-090-bab8d793ff96"><strong>100 Days Of ML Code Day 090</strong><em>Recap From Day 089</em>medium.com</a></p><p>I promised you yesterday that I'd explain a little more formally what a sampling rate is, and that we'll look at the Nyquist Theorem, which gives us some guidance on picking a sampling rate.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*IZQghgxJc5_yJ6yL.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823991218/ZJa7Ll_a3o.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiFy8PZw4_eAhWNyoUKHU5-B14QjRx6BAgBEAU&url=http%3A%2F%2Fmusicweb.ucsd.edu%2F~trsmyth%2FdigitalAudio170%2FNyquist_Sampling_Theorem.html&psig=AOvVaw1A-J_52iW6A3b05e9Vagfi&ust=1539936184499927">Source</a></em></p><p>We learned a bit more formally what a sampling rate is, but didn't get the chance to look at the Nyquist Theorem.</p><p>Today, without further ado, let's get to it.</p><h2 id="nyquist-theorem"><strong>Nyquist Theorem</strong></h2><p>The Nyquist Theorem says that the sampling rate must be at least twice the highest frequency that you wish to represent. This makes a lot of intuitive sense when you think about it. To see why, let's think about our sine wave again.</p><p>If I have a sine wave going at, say, 440 hertz, then I have 440 peaks and 440 troughs happening every second, so the minimum that I need to capture digitally, in terms of those dots, those amplitude readings, is this: for each cycle of my sine wave, I need at least one sample somewhere on the peak, somewhere above the zero crossing, and one sample below the zero crossing, somewhere down by the trough.</p><p>So I need 440 above-zero readings and 440 below-zero readings to be able to capture the 440 cycles of my sine wave in a second. I simply multiply 440 by 2 and end up with 880 as the sampling rate I would need. In reality, we're not looking at every individual sine wave or frequency component we want to represent; we want to come up with some general sampling rate that's going to work really well for a lot of things. So what should that sampling rate be? We can deduce this logically.</p><p>We talked about the range of human hearing going from roughly 20 hertz up to 20,000 hertz. So if we take 20,000 hertz and multiply it by 2, we end up with 40,000 hertz. We know that the sampling rate must be at least 40,000 hertz, and the number that we usually end up seeing is 44,100 hertz. The reason for this has to do with the history of the early days of digital recording and some decisions that Sony and other manufacturers made in the late 1970s that aren't really worth getting into here, but that number has largely stuck: 44,100 hertz is the sampling rate we use on compact discs in particular.</p><p>You'll sometimes see other sampling rates, like 48,000 hertz, for instance. You'll sometimes see higher rates like 96,000 hertz or even 192,000 hertz in very high fidelity recordings, and the reason for that, of course, is that even if we capture at least one sample somewhere on the peak of a sine wave and one somewhere on the trough, that's not going to be enough to really capture the entire shape of that sine wave, that entire curve.</p><p>If you want to get a really nice representation of it, you're going to want as many samples as possible all along the way. So, the higher the sampling rate, the better resolution we'll get and the better we will be able to represent those curves.</p><p>That's all for day 091. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]><![CDATA[<p>100 Days Of ML Code Day 091</p><h2 id="recap-from-day-090">Recap From Day 090</h2><p>In day 090, we looked at Sampling Rate.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-090-bab8d793ff96"><strong>100 Days Of ML Code Day 090</strong><em>Recap From Day 089</em>medium.com</a></p><p>I promised you yesterday that I'd explain a little more formally what a sampling rate is, and that we'll look at the Nyquist Theorem, which gives us some guidance on picking a sampling rate.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*IZQghgxJc5_yJ6yL.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632823991218/ZJa7Ll_a3o.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwiFy8PZw4_eAhWNyoUKHU5-B14QjRx6BAgBEAU&url=http%3A%2F%2Fmusicweb.ucsd.edu%2F~trsmyth%2FdigitalAudio170%2FNyquist_Sampling_Theorem.html&psig=AOvVaw1A-J_52iW6A3b05e9Vagfi&ust=1539936184499927">Source</a></em></p><p>We learned a bit more formally what a sampling rate is, but didn't get the chance to look at the Nyquist Theorem.</p><p>Today, without further ado, let's get to it.</p><h2 id="nyquist-theorem"><strong>Nyquist Theorem</strong></h2><p>The Nyquist Theorem says that the sampling rate must be at least twice the highest frequency that you wish to represent. This makes a lot of intuitive sense when you think about it. To see why, let's think about our sine wave again.</p><p>If I have a sine wave going at, say, 440 hertz, then I have 440 peaks and 440 troughs happening every second, so the minimum that I need to capture digitally, in terms of those dots, those amplitude readings, is this: for each cycle of my sine wave, I need at least one sample somewhere on the peak, somewhere above the zero crossing, and one sample below the zero crossing, somewhere down by the trough.</p><p>So I need 440 above-zero readings and 440 below-zero readings to be able to capture the 440 cycles of my sine wave in a second. I simply multiply 440 by 2 and end up with 880 as the sampling rate I would need. In reality, we're not looking at every individual sine wave or frequency component we want to represent; we want to come up with some general sampling rate that's going to work really well for a lot of things. So what should that sampling rate be? We can deduce this logically.</p><p>We talked about the range of human hearing going from roughly 20 hertz up to 20,000 hertz. So if we take 20,000 hertz and multiply it by 2, we end up with 40,000 hertz. We know that the sampling rate must be at least 40,000 hertz, and the number that we usually end up seeing is 44,100 Hertz.
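<p>That deduction is a one-line computation: double the highest frequency you want to keep, and make sure the sampling rate clears it. A minimal sketch using the 20,000 hertz upper limit of hearing from above (the helper names are my own):</p>

```python
def minimum_sample_rate(highest_freq):
    """Nyquist Theorem: the rate must be at least twice the highest frequency."""
    return 2 * highest_freq

def can_represent(freq, sample_rate):
    """True if a tone at freq hertz survives sampling at sample_rate hertz."""
    return sample_rate >= 2 * freq

print(minimum_sample_rate(20000))   # 40000: the floor for full-range audio
print(can_represent(20000, 44100))  # True: the CD rate clears the limit
print(can_represent(25000, 44100))  # False: above the Nyquist frequency
```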
The reason for this has to do with the history of the early days of digital recording and some decisions that Sony and other manufacturers made in the late 1970s that aren't really worth getting into here, but that number has largely stuck: 44,100 hertz is the sampling rate we use on compact discs in particular.</p><p>You'll sometimes see other sampling rates, like 48,000 hertz, for instance. You'll sometimes see higher rates like 96,000 hertz or even 192,000 hertz in very high fidelity recordings, and the reason for that, of course, is that even if we capture at least one sample somewhere on the peak of a sine wave and one somewhere on the trough, that's not going to be enough to really capture the entire shape of that sine wave, that entire curve.</p><p>If you want to get a really nice representation of it, you're going to want as many samples as possible all along the way. So, the higher the sampling rate, the better resolution we'll get and the better we will be able to represent those curves.</p><p>That's all for day 091. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey.
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632823993482/n_2ngv_HqV.png<![CDATA[100 Days Of ML Code — Day 090]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-090-bab8d793ff96https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-090-bab8d793ff96Wed, 17 Oct 2018 20:44:06 GMT<![CDATA[<p>100 Days Of ML Code Day 090</p><h2 id="recap-from-day-089">Recap From Day 089</h2><p>In day 089, we looked at Copying Analog And Digital.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-089-2409da6cc09a"><strong>100 Days Of ML Code Day 089</strong><em>Recap From Day 088</em>medium.com</a></p><p>If you lose your digital data, if it gets corrupted, usually you're just finished: you have no semblance of the original left to work with at all. That's another point I wanted to make about the relationship between analog and digital.</p><p>To review: in the past two days we talked about the basic differences between analog and digital sound as continuous versus discrete representations of audio, and we talked about the discrete samples of audio that we get in a digital representation of sound. We looked at three questions. Sampling rate: the horizontal resolution, how many samples are we taking per second. Bit width: the vertical resolution, how many bits we use to represent each amplitude value. And how many channels of sound do we need. 
We talked about some implications of analog versus digital sound for copying audio and also for the degradation and preservation of audio.</p><p>Starting today and over the next couple of days, we're going to delve into the three ideas of sampling rates, bit widths and channels in much more depth and look at some of the details of how we decide what we need in each of those domains.</p><p>Like I mentioned in the last paragraph, today we're going to delve into the question of sampling rate in much more detail and ask a simple question: how do we determine what the appropriate sampling rate is? We'll explain a little more formally what a sampling rate is, and we'll look at the Nyquist Theorem, which gives us some guidance on picking a sampling rate.</p><p>Later on, we'll also talk about foldover, which is something that can happen, and that we usually don't want to happen, if you pick a sampling rate that is too low for a particular project.</p><h2 id="sampling-rate">Sampling Rate</h2><p>Sampling rate is, very simply put, the number of samples per second of digital audio.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824000445/Itc3tqWLx.png" alt /></p><p>If you recall, we have this kind of very zoomed-in sine wave as seen in the image above. Each of those dots is a sample. Sampling rate is simply asking, well, how many of those dots are we capturing every second? And because it is a samples-per-second kind of metric, we actually use hertz to represent it, the same unit we used to represent frequency.</p><p>For instance, if 8,000 hertz is our sampling rate, it simply means that we're capturing 8,000 of these samples every second. So that's how we talk about sampling rate. And now, how do we decide what our sampling rate should be? It's actually fairly simple. We use something called the Nyquist Theorem, which is also sometimes known as the sampling theorem.</p><p>That's all for day 090. I hope you found this informative. 
Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824002535/-Fp75Aa-z.png<![CDATA[100 Days Of ML Code — Day 089]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-089-2409da6cc09ahttps://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-089-2409da6cc09aTue, 16 Oct 2018 07:15:40 GMT<![CDATA[<p>100 Days Of ML Code Day 089</p><h2 id="recap-from-day-088">Recap From Day 088</h2><p>In day 088, we looked at part two of analog versus digital sound.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-088-fd81eb10866b"><strong>100 Days Of ML Code Day 088</strong><em>Recap From Day 087</em>medium.com</a></p><p>Today we will continue from where we left off in day 088.</p><h2 id="copying-analog-and-digital">Copying Analog And Digital</h2><p>I said I wanted to take a quick digression. It's not really a digression, because it's elemental to the differences between analog and digital media, and it has to do with how we copy media.</p><p>When you make a copy of a record, say if you're dubbing a record to a cassette tape or something like I did as a kid, it's not a perfect copy that you end up with, because you're copying analog data. You're copying this continuous function. It's not going to be perfect. It's essentially rerecording the data as it's being played back.</p><p>In the digital domain you're just copying a bunch of zeros and ones, and if you're worried you might make a mistake, you can go back and check again and again, in all kinds of different ways, to make sure that you haven't made a mistake.</p><p>Digital copies are perfect replicas of the originals. 
There isn't really a sense of a master anymore, because every copy can be perfect, and this obviously has had lots of implications for music sharing, piracy and legal ramifications.</p><p>Once you can just rip a CD or share a file online and it's perfect, as opposed to the generational effects of making copies of copies of copies of analog media, it becomes much more of an issue.</p><p>What I really wanted to talk about here is the implications of analog versus digital in a more artistic sense, and to demonstrate this, I want to talk to you about a work by filmmaker Bill Morrison and composer Michael Gordon called Light is Calling.</p><p><img src="https://cdn-images-1.medium.com/max/2560/0*YBDkLYLNoKQtHJQg.jpg" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632824009408/tXLK-sAe1V.html)" /><em><a target="_blank" href="https://www.google.com.ng/url?sa=i&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwi58-mPtIreAhUlKsAKHWysDPIQjRx6BAgBEAU&url=https%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3Dyx0HzBiaVn4&psig=AOvVaw2h1I71WslDFftLW43umAhA&ust=1539760255277607">Source</a></em></p><p>It was written in 2004. What Bill Morrison did here was take some film footage from some early silent films that was in archives, and these film reels were starting to decay.</p><p>If you put these into a projector they might just disintegrate, or they might be able to play once or twice before they start falling apart, but the images were not as they looked originally in the 20s or the 30s or whenever they were originally made.</p><p>They're really transformed and dirty, and there's all kinds of noise, and sometimes it's impossible to tell what the original was; sometimes you can make out little bits of it. So he edited a bunch of stuff together from a silent film called The Bells for this piece, and then Michael Gordon, actually using digital sound, composed a soundtrack to go along with it.</p><p>What I think it really shows is how analog and digital 
media can decay in different ways, because here's this ancient, crumbling analog film reel and it still contains some of the original information in it. Digital data doesn't degrade nearly as gracefully.</p><p>That's all for day 089. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824012046/86JOpQ14I.jpeg<![CDATA[100 Days Of ML Code — Day 088]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-088-fd81eb10866bhttps://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-088-fd81eb10866bMon, 15 Oct 2018 08:46:50 GMT<![CDATA[<p>100 Days Of ML Code Day 088</p><h2 id="recap-from-day-087">Recap From Day 087</h2><p>In day 087, we looked at analog versus digital sound.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-087-fa09c24823d5"><strong>100 Days Of ML Code Day 087</strong><em>Recap From Day 086</em>medium.com</a></p><p>Today we will continue from where we left off in day 087.</p><h2 id="analog-versus-digital-sound-continued">Analog Versus Digital Sound Continued</h2><p>The key difference between analog and digital is continuous versus discrete. Just to drive that point home, I want to go into Audacity; below is a sine wave. All I've done is zoom in basically as far as Audacity will go.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824018316/6Fy6gNk0b.png" alt /></p><p>That's maybe a little bit too far, but you can see the individual dots on it. 
Each of those represents an individual amplitude value that we've stored digitally, so we're down to the lowest level of the digital representation as we zoom in further and further.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824020048/QAiB0uR_3.png" alt /></p><p>We can see those individual samples that were captured, and of course Audacity has done a nice job of connecting the dots between them, where we don't really know what's happening; we can just interpolate. So that's the key difference here: continuous versus discrete.</p><p>In the digital domain, if we're thinking about what is actually stored at each of those dots, there are a few different considerations that we need to think about, and we'll be exploring these in much more detail in the coming days.</p><p>The first one is sampling rate: how many of those dots are we recording every second? This has to do with the horizontal, the time resolution that we're capturing. How many amplitude values are we recording every second? How fast do those dots come one after another?</p><p>The second one is bit width. This has to do with resolution in the amplitude dimension: how many bits of digital data are we reserving to store every single one of our individual amplitude values? So again, bit width has to do with our resolution on the y axis of a waveform; sampling rate has to do with our resolution on the x axis.</p><p>The third thing is how many amplitude samples are we recording for each of the values in the image below? Right now we just see a single dot for each amplitude value in our waveform view, but we might need to record two channels or three channels or ten channels in order to represent the locations of sound in space.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824021792/XAbpDULPH.png" alt /></p><p>We may need to capture multiple amplitude values at each moment. 
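Bit width and channels can be sketched in a few lines of code. This is a minimal, hypothetical illustration (the `quantize` helper and its clamping and rounding choices are my assumptions, not how any particular audio format specifies it) of how bit width fixes the number of discrete amplitude levels, and how extra channels simply mean extra amplitude readings per moment.

```python
def quantize(amplitude, bit_width=16):
    """Map a continuous amplitude in [-1.0, 1.0] to the nearest of the
    discrete levels that bit_width bits can store (hypothetical scheme)."""
    levels = 2 ** (bit_width - 1) - 1      # e.g. 32767 levels each side at 16 bits
    clamped = max(-1.0, min(1.0, amplitude))
    return round(clamped * levels)

print(quantize(0.5))      # 16384: half-way up the 16-bit range
print(quantize(0.5, 8))   # 64: the same amplitude at a much coarser 8-bit resolution

# Channels: one amplitude reading per channel at each moment in time.
stereo_frame = (quantize(0.25), quantize(-0.25))   # left, right
```

The fewer bits we reserve, the fewer rungs the amplitude ladder has, which is exactly the vertical-resolution trade-off described above.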
Stereo sound, for instance, has two channels.</p><p>So those are the fundamental things we'll be exploring over the coming days. But before we do that, I said I wanted to take a quick digression. It's not really a digression, because it's elemental to the differences between analog and digital media, and it has to do with how we copy these media.</p><p>That's all for day 088. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824023811/9ohPVp_u9.png<![CDATA[100 Days Of ML Code — Day 087]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-087-fa09c24823d5https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-087-fa09c24823d5Fri, 12 Oct 2018 22:58:24 GMT<![CDATA[<p>100 Days Of ML Code Day 087</p><h2 id="recap-from-day-086">Recap From Day 086</h2><p>In day 086, we looked at Envelope And Spectrogram.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-086-c93f8d240910"><strong>100 Days Of ML Code Day 086</strong><em>Recap From Day 085</em>medium.com</a></p><p>In the past two days, we looked at timbre. We talked about timbre as consisting of two components: spectra and envelope. We talked briefly about the Fourier theorem for periodic sounds, how it describes them as consisting of a series of sine waves at particular frequencies in integer-multiple relationships to each other.</p><p>We talked about two new visual representations of sound. 
The spectrum, which shows the frequency content at a particular moment, and the sonogram, which shows how that frequency content changes over time.</p><p>In the next couple of days, we are going to shift gears a little bit and focus on how we represent sound digitally on a computer and all the issues that come up with that.</p><p>So starting today and over the next several days we're going to focus on digital sound: how we represent audio waveforms digitally on a computer, a compact disc or any other kind of digital media. We'll talk about a lot of the issues that come up with that in particular.</p><p>Today, we're going to look broadly at the differences between analog and digital sound and the different challenges that come up in each medium, and then we're also going to take a little bit of a digression to talk about some of the implications of copying audio in the analog versus digital domains, and of preserving and archiving it.</p><p>I want to start by talking about analog versus digital sound, and the key idea here is that analog is a continuous medium and digital is a discrete medium. Consider the flagship examples of each recording medium.</p><p>Let's think of vinyl records as the quintessential analog audio medium, and compact discs as the quintessential digital medium.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824031127/PA43q-Upo.png" alt /></p><p>On a record, we have a needle that's reading from the grooves, and those grooves are going up and down, creating the different amplitude values that are reproduced by the record player. So this is a continuous process; it's a continuous function.</p><p>We can think of it as y equals f of x, or something like that, where at any arbitrary continuous moment in time there is an amplitude value. So that's the analog realm.</p><p>When we move to the digital realm, think of the compact disc. 
We have a laser that is reading a bunch of zeroes and ones off of our compact disc. Zeroes and ones are discrete, and they're representing discrete moments in time, discrete amplitude values at particular points. It's no longer a continuous function; it's a discrete function.</p><p>We decide at what moments in time we want to record these amplitude values, and then we capture them in those moments and play them back in those moments. We know nothing about what happens in between these discrete points in time where we're taking these amplitude samples.</p><p>And so this is the key difference between analog and digital: continuous versus discrete. That's all for day 087. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824033839/pIEptPxRg.png<![CDATA[100 Days Of ML Code — Day 086]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-086-c93f8d240910https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-086-c93f8d240910Thu, 11 Oct 2018 22:18:07 GMT<![CDATA[<p>100 Days Of ML Code Day 086</p><h2 id="recap-from-day-085">Recap From Day 085</h2><p>In day 085, we looked at the Fourier Theorem.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-085-75d16b03aa8f"><strong>100 Days Of ML Code Day 085</strong><em>Recap From Day 084</em>medium.com</a></p><p>Today, we will continue from where we left off in day 085.</p><h2 id="envelope-and-spectrogram">Envelope And Spectrogram</h2><p>Another way we can look at timbre information visually is with something called a sonogram or spectrogram, and the image below is a little bit different, because what we have on the x-axis now is time. 
Our y-axis is frequency, and the colour maps to decibels.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*XbhPsKaoWamGevwU.png" alt="[Source](https://cdn.hashnode.com/res/hashnode/image/upload/v1632824042666/cVr0pEkU-.png)" /><em><a target="_blank" href="https://upload.wikimedia.org/wikipedia/commons/c/c5/Spectrogram-19thC.png">Source</a></em></p><p>The reddest areas in the colour scheme are the ones that are highest in decibels. We can think of any given point as a particular moment in time at a particular place in frequency space, with the colour indicating the decibels at that moment in time and that place in frequency space.</p><p>The reason sonograms and spectrograms are important is that we obviously have sounds in the real world that aren't sine waves or sawtooth waves or square waves, sounds that change a lot over their course, and this is a key component of timbre as well.</p><p>It's not enough to say how the frequencies are distributed and where the energy is across the frequency space; you also have to be able to say how it's changing over time.</p><p>It would not be enough just to list a bunch of frequencies and their amplitudes and phases in order to describe an instrument's sound, because we have to describe how it's changing at the beginning, the attack portion of the sound. We have to describe its envelope, how it's changing over time.</p><p>Below is a live sonogram view of a sawtooth wave.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824046043/ZTzgeN18Q.png" alt /></p><p>You can see that it's just those straight lines: the frequency components, according to the Fourier Theorem, never changing. 
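That Fourier picture of the sawtooth can be checked numerically. Here is a small sketch using NumPy's FFT (the 100 Hz fundamental, the one-second window and the normalisation are my choices for illustration): the spectrum shows energy only at integer multiples of the fundamental, falling off as the harmonic number rises, with essentially nothing in between.

```python
import numpy as np

fs = 8000                           # sampling rate, in samples per second
t = np.arange(fs) / fs              # one second of sample times
f0 = 100                            # fundamental frequency of the sawtooth
saw = 2.0 * ((f0 * t) % 1.0) - 1.0  # naive sawtooth ramping between -1 and 1

# With a one-second window, FFT bin k corresponds to exactly k Hz.
spectrum = np.abs(np.fft.rfft(saw)) / len(saw)

print(spectrum[100], spectrum[200], spectrum[300])  # strong, falling off with harmonic number
print(spectrum[150])                                # between harmonics: essentially nothing
```

A sonogram is just this same computation repeated on many short, overlapping windows and laid out left to right over time.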
See the image below for how a more complex sound looks.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824048209/dV5jSa5WQ.png" alt /></p><p>Now that is obviously changing because the pitches are changing, but, equally important, with each of the notes you can see from the image above that they're not just static lines. Those lines are growing, and shrinking, and moving around, and they look almost like real drawings or squiggles, rather than simple straight lines that are perfect. That is where there's a difference between the sounds that we work with in real life as opposed to the test sounds, the sawtooth waves, and what we need to describe their timbres. It's not enough to just say what the vertical, the frequency component, is; you need to describe the horizontal as well, how it's changing over time.</p><p>That's all for day 086. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824050457/PJQ_Mhf6YX.png<![CDATA[100 Days Of ML Code — Day 085]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-085-75d16b03aa8fhttps://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-085-75d16b03aa8fWed, 10 Oct 2018 21:20:31 GMT<![CDATA[<p>100 Days Of ML Code Day 085</p><h2 id="recap-from-day-084">Recap From Day 084</h2><p>In day 084, we looked at how we describe two sounds that have the exact same pitch and the exact same loudness but sound really different from each other.</p><p>You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-084-ed7849344ca0"><strong>100 Days Of ML Code Day 084</strong><em>Recap From Day 083</em>medium.com</a></p><p>Today, we will continue from where we left off in day 084.</p><h2 id="fourier-theorem">Fourier Theorem</h2><p>As seen in the image below, we have a sawtooth wave, and so you see a number of peaks highlighted in red. You see a peak at 440 Hertz, and another one at 880, and at 1320 and 1760, and so on and so forth.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824058058/5SJBZs8xB.png" alt /></p><p>Those are all coming at integer multiples of the fundamental frequency, 440. They're at one times, two times, three times, four times, five times, six times, and on and on, and you can notice each one is at lower decibels than the one that came before it. So most of the energy is at 440, and then it goes down and down and down from there.</p><p>The recording above might be a little bit loud, so watch the volume on your headphones or your speakers. It is a sawtooth wave, the same frequency as the sine wave was, but obviously it has a very different timbre. And part of the way that we can explain this is because of the different spectra. 
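A spectrum like that can be read as a recipe for resynthesis: sine waves at 440, 880, 1320, ... Hertz, each weaker than the last. A small additive-synthesis sketch along those lines; the 1/n amplitude falloff is the textbook recipe for an ideal sawtooth, and the 40-harmonic cutoff and sample rate are my own choices, not measurements from the image:

```python
import math

def sawtooth_additive(freq, sr, length, n_harmonics):
    """Sum sine waves at integer multiples of the fundamental, with the
    n-th harmonic at amplitude 1/n: the ideal sawtooth recipe."""
    out = []
    for t in range(length):
        s = sum(math.sin(2 * math.pi * n * freq * t / sr) / n
                for n in range(1, n_harmonics + 1))
        out.append(2 / math.pi * s)  # scale so the ideal wave spans -1..1
    return out

sr = 44100
wave = sawtooth_additive(440, sr, sr // 100, n_harmonics=40)
# The more harmonics we keep, the closer this gets to the ramp shape of a
# true sawtooth; with only the first harmonic it is just a 440 Hz sine.
print(min(wave), max(wave))
```

Summing the harmonics back together and hearing a single sawtooth-like tone, rather than forty separate pitches, is exactly the blending effect described next.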
Because they both have their peak at 440 Hertz (as highlighted in blue in the image below), but obviously there's all this extra stuff happening in the different frequencies.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824060173/HTDPvPRnn.png" alt /></p><p>As we've seen before, we're not hearing all those as separate sine waves or separate frequency components. They're all kind of blending together to create this single sound in our minds, and the way that we can understand this is through a very important theorem in music technology called the Fourier Theorem.</p><p>What the Fourier Theorem says is that any periodic waveform can be represented as the sum of sine waves at frequencies that are integer multiples of a fundamental frequency; our fundamental frequency in this case was 440 Hertz. The integer multiples are what we saw before: 1 times 440, 2 times 440, 3 times 440, and so on and so forth.</p><p>At each of those integer multiples, we have a sine wave at some particular frequency, amplitude, and phase, and if we add those together, we can represent any periodic waveform, like a sawtooth wave, or a triangle wave, or something like that.</p><p>Now it is important to emphasize the word periodic here. This is a very important caveat, because a real-world waveform, like a recording of someone talking, is not periodic; the cycles don't repeat infinitely the way that a sine wave would, or a sawtooth wave, or something like that. So the Fourier Theorem only works for periodic waveforms.</p><p>That's all for day 085. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. 
And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824062549/7o-ObQMSt.png<![CDATA[100 Days Of ML Code — Day 084]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-084-ed7849344ca0https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-084-ed7849344ca0Tue, 09 Oct 2018 18:45:10 GMT<![CDATA[<p>100 Days Of ML Code Day 084</p><h2 id="recap-from-day-083">Recap From Day 083</h2><p>In day 083, we looked at the Fletcher-Munson loudness curves. You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-083-ebb84e0af64d"><strong>100 Days Of ML Code Day 083</strong><em>Recap From Day 082</em>medium.com</a></p><p>What we're going to talk about today and over the next couple of days is this: if we had two sounds that have the exact same pitch and the exact same loudness but sound really different from each other, how do we describe that? So we'll be talking about timbre today.</p><h2 id="timbre">Timbre</h2><p>Over the last couple of days, we looked at loudness and we looked at pitch, and I asked the question of how we can distinguish two sounds that have the same loudness and the same pitch and yet sound very different. That's where the notion of timbre comes in.</p><p>Just to start, let's say I play you the sound of a trombone and then a double bass. Same pitch, more or less the same loudness, but they sound very different. How do we describe the differences between them? That's what we are going to talk about.</p><p>In the coming days, we will look at what timbre is. We will describe it in terms of two components, spectra and envelope, and we will also look at some different ways we can look at sound besides the waveform view we've been using, ways that show the timbre of sounds a little bit more clearly.</p><p>Let's talk about the Fourier Theorem, which helps explain how timbre works. There are not many decent definitions of timbre out there. 
But let's look at my favourite definition, which comes from the American Standards Association, and this is how it goes. It says: "that attribute of sensation in terms of which a listener can judge that two sounds having the same loudness and pitch are dissimilar."</p><p>The language in the definition above is a little fancy, but basically what it's saying is: if they've got the same loudness, and they've got the same pitch, but they still sound different to you, well, that's timbre. So basically it's just a grab bag of everything else about a sound that can't be described by its loudness and pitch.</p><p>The definition above is really just saying what timbre is not, rather than what it is. Colloquially we tend to define timbre as the colour or the tone or something like that, and that's fine because it gives us a general sense of what we're talking about with timbre, but it doesn't get into specifics; it's really just a metaphor. It explains, in this vague way, this other stuff that we don't really have a good way to describe.</p><p>The way we're going to talk about timbre going forward is in terms of two key components, spectra and envelope. I'm going to talk about this in a little bit more detail. In order to do that we really need to look at visual representations of sounds. Remember that up to now we've been using the waveform representation of sounds, where our x-axis is time and our y-axis is amplitude.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824069832/q1BqJGzXzy.png" alt="sound spectrum" /><em>sound spectrum</em></p><p>What we have in the image above is a sound spectrum, a visual representation of sound which is basically showing where there's energy at different frequencies. What we have on our x-axis is frequency, and our y-axis is decibels. And in the one on the left, we can see a sine wave at 440 Hertz. 
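A spectral peak like that can also be located numerically: compute the energy in each frequency bin of a short recording and take the maximum. A minimal pure-Python sketch; a real tool would use an FFT, and the 8000 Hz rate and 800-sample length are illustrative assumptions:

```python
import cmath
import math

sr, n = 8000, 800            # 0.1 s of audio -> bins are 10 Hz apart
signal = [math.sin(2 * math.pi * 440 * t / sr) for t in range(n)]

# Energy in each frequency bin (naive DFT over the first half of the bins).
spectrum = [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
print(peak_bin * sr / n)     # frequency of the spectrum's peak -> 440.0
```

Because 440 Hz lands exactly on a bin here (44 whole cycles fit in the 800 samples), the energy concentrates in a single bin, which is the clean peak described above.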
It's a little hard to read the units on the diagram above, but it is 440 Hertz on the x-axis, and you can see the peak right there (the area highlighted by a red circle), showing that the sine wave's peak energy is at the 440 Hertz point.</p><p>That's all for day 084. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824072193/NO9qkKFve.png<![CDATA[100 Days Of ML Code — Day 083]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-083-ebb84e0af64dhttps://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-083-ebb84e0af64dSun, 07 Oct 2018 20:20:34 GMT<![CDATA[<p>100 Days Of ML Code Day 083</p><h2 id="recap-from-day-082">Recap From Day 082</h2><p>In day 082, we looked at the harmonic series. You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-082-cff5bb359f1e"><strong>100 Days Of ML Code Day 082</strong><em>Recap From Day 081</em>medium.com</a></p><p>Today, we will continue from where we left off in day 082.</p><h2 id="fletcher-munson-loudness-curves">Fletcher-Munson Loudness Curves</h2><p>Let's look at one more thing before we leave our discussion of psychoacoustics for now. I want you to play the chirp sound below. 
I want you to focus on how loud it sounds over the course of the chirp from 20 Hertz to 20,000 Hertz.</p><p>Does it sound like it's ever getting louder or softer, or does it feel like the loudness is the same the whole time? Okay, the loudness is obviously changing as that goes from 20 Hertz up to 20,000 Hertz. Yet the amplitude of the sine wave in the file is actually not changing at all. It's using the full negative one to positive one range throughout, but our perception of it is changing based on the frequency of the sine wave.</p><p>This is explained by a phenomenon captured in the Fletcher-Munson loudness curves.</p><p><img src="https://cdn-images-1.medium.com/max/2000/0*NAuIlksdkLNL3VyB.gif" alt="Fletcher-Munson loudness curves" /><em><a target="_blank" href="http://2.bp.blogspot.com/-ZId8CS3_dk8/UCmyZbn1gKI/AAAAAAAADMg/CnE0RsxEE2Q/s1600/Countours%2Bfor%2BBlog.gif">Source</a></em></p><p>In the Fletcher-Munson loudness curves, the y-axis is decibels and the x-axis is frequency. If we follow one of the contours above, changing the amplitude as we go, we actually perceive that curve as being the exact same loudness throughout.</p><p>So in order to get something that sounds like it's equally loud from 20 Hertz all the way up to 20,000 Hertz, we actually have to change its amplitude, in order to kind of fake our ears into hearing it as the same. That's because our ears are more sensitive, across a broader range of dynamics, especially around 3,000 to 5,000 Hertz, than at, let's say, the very low end of the spectrum or even at the very high end. 
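A test chirp like the one described, constant amplitude with the frequency swept from 20 Hertz to 20,000 Hertz, can be generated by accumulating phase sample by sample while the instantaneous frequency changes. A sketch; the linear sweep, two-second duration, and 44,100 Hz sample rate are my assumptions about the clip, not facts from it:

```python
import math

def linear_chirp(f_start, f_end, duration, sr):
    """Constant-amplitude sine sweep: the frequency changes, the level does not.
    Phase is accumulated sample by sample so the sweep stays continuous."""
    n = int(duration * sr)
    samples, phase = [], 0.0
    for t in range(n):
        f = f_start + (f_end - f_start) * t / n   # instantaneous frequency
        phase += 2 * math.pi * f / sr
        samples.append(math.sin(phase))           # full -1..1 range throughout
    return samples

sr = 44100
chirp = linear_chirp(20, 20000, 2.0, sr)
print(len(chirp), max(chirp))
```

Every sample stays within the same -1 to 1 envelope, yet the sweep sounds louder in the mid range and quieter at the extremes, which is exactly the perceptual effect the curves describe.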
So this is another example of how frequency and loudness come together in our brains as we're hearing sounds, to create effects that are very different from what we might see if we were just looking at a waveform.</p><p>So to review what we've covered in the past three days: we talked about psychoacoustics as describing how we perceive sound, and not just how it exists acoustically in the world or how we might represent it as a waveform. We particularly talked about loudness versus amplitude, and we talked about pitch versus frequency. We looked at the Fletcher-Munson loudness curves as a really good example of this.</p><p>That's all for day 083. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824081803/lA5oOodPZ.gif<![CDATA[100 Days Of ML Code — Day 082]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-082-cff5bb359f1ehttps://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-082-cff5bb359f1eSat, 06 Oct 2018 19:51:41 GMT<![CDATA[<p>100 Days Of ML Code Day 082</p><h2 id="recap-from-day-081">Recap From Day 081</h2><p>In day 081, we looked at the third part of loudness and pitch. You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-081-7ae86d953f48"><strong>100 Days Of ML Code Day 081</strong><em>Recap From Day 080</em>medium.com</a></p><p>Today, we will continue from where we left off in day 081.</p><h2 id="harmonic-series">Harmonic Series</h2><p>Toward the end of day 081, we saw an image that contains two frequencies, two sine waves, one at 440 Hertz and the second one at 880 Hertz. I concluded by asking what happens if we actually listen to them.</p><p>Play the two audio clips above to hear what they sound like. When you played those together, the 440 Hertz sine wave and the 880 Hertz sine wave, how many different pitches did you hear? Okay, that's enough to get an idea. If you play just the 440 Hertz one, you hear that very clearly, or play just the 880 Hertz one, you hear that very clearly. 
But when you play them together you'll hear something very different.</p><p>We find some melding of those because they have this special relationship to one another, and this is something that's even more evident if we go to real-world sounds.</p><p>If you play the video above, the sound you'd hear is a trombone sound, but we're not actually hearing the original trombone sound here. We're hearing a bunch of sine waves. This is actually a harmonic series. A <strong>harmonic series</strong> is a sequence of sounds, pure tones represented by sinusoidal waves, in which the frequency of each sound is an integer multiple of the fundamental, the lowest frequency.</p><p>The point is that it's not just the difference between the linear and logarithmic relationships in terms of frequency and pitch, but it's also the difference between making out individual frequencies and hearing them meld into some bigger composite result.</p><p>That's all for day 082. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824089470/5zWpZ8ZZW.jpeg<![CDATA[100 Days Of ML Code — Day 081]]>https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-081-7ae86d953f48https://jehoshaphatia.hashnode.dev/100-days-of-ml-code-day-081-7ae86d953f48Fri, 05 Oct 2018 20:03:41 GMT<![CDATA[<p>100 Days Of ML Code Day 081</p><h2 id="recap-from-day-080">Recap From Day 080</h2><p>In day 080, we looked at loudness and pitch. 
You can catch up using the link below.<a target="_blank" href="https://medium.com/@jehoshaphatia/100-days-of-ml-code-day-080-932e751d577b"><strong>100 Days Of ML Code Day 080</strong><em>Recap From Day 079</em>medium.com</a></p><p>Today, we will continue from where we left off in day 080</p><h2 id="loudness-continued">Loudness Continued</h2><p>As were thinking about frequency, we think about frequency as going up linearly and theres a key musical construct thats described. Its called the Harmonic Series. If we have a base frequency at say 100 Hertz, well we can think of integer multiples of that. So 2 times 100 is 200, 3 times would be 300, 4 times 400, and so on, 500, 600 and on and on and on. Harmonic Series is very important in music and we can think about the Hertz as representing our base frequency.</p><p>If we were to represent the low C in the image below then when we double that frequency we would be in the C an octave above. When we go three times that original frequency, we would be the G above that and if we went four times we would be the C above that and so were not always getting Cs. Were getting different notes.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824097117/1Y90wC1lk.png" alt /></p><p>If we went from there we would get an E and we get a G and then a kind of B flat and so on and so forth. But theres another way to think about pitch which is in terms of octaves and this is not a linear scale of 1 times, 2 times, 3 times, 4 times anymore. This is a scale of doubling every time. So 100, 200 Hertz, 400 Hertz, 800 Hertz, 1600 Hertz and so on and so forth.</p><p>If we go at those frequency ratios always doubling or rather than always multiplying by some integer multiple, we end up with successive octaves where theyre all Cs, from C to C to C to C and so you see we got C, we double it, we get the C the next octave up as seen below. We double that, we get the C the next octave up. 
We double that, we get the C the next octave up.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824099179/FkiiGnpzEw.png" alt /></p><p>And so again, the way that we hear pitch is not on this linear frequency scale but on this logarithmic octave scale, because we hear these Cs as sharing something in common with each other, and going from one C to the next is traversing this space of an octave, even though the difference between 100 and 200 Hertz and between 200 and 400 Hertz is different in Hertz space: 100 versus 200.</p><p>So again there's this difference between how we represent things in frequency and how we hear them in terms of these octaves: this pitch, this logarithmic relationship. I want to go a little bit further than that, because we hear something else that's a little bit more complicated too when we're listening to pitch instead of frequencies.</p><p>The image below contains two frequencies, two sine waves: one is at 440 Hertz, the one on top, and the one on the bottom is at 880 Hertz.</p><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632824102799/YgqsDCXBh.png" alt /></p><p>So there is a two-to-one relationship; they're an octave apart from each other. Now what happens if we actually listen to them? That's all for day 081. I hope you found this informative. Thank you for taking time out of your schedule and allowing me to be your guide on this journey. And until next time, be legendary.</p>]]>https://cdn.hashnode.com/res/hashnode/image/upload/v1632824104410/pK_Xm1IIUi.png
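The harmonic-series and octave relationships described in these posts can be sketched in a few lines of Python. This is a minimal sketch, assuming NumPy; the function names are my own, and the 1/k amplitude falloff in the composite is just an illustrative choice, not the trombone's actual spectrum. A harmonic series multiplies a fundamental by successive integers, an octave series doubles it at each step, and summing sine waves at harmonic frequencies produces the kind of composite tone heard in the trombone example.

```python
import numpy as np

def harmonic_series(fundamental_hz, n):
    # Integer multiples of the fundamental: 100, 200, 300, ... Hz
    return [fundamental_hz * k for k in range(1, n + 1)]

def octave_series(fundamental_hz, n):
    # Doubling at every step: 100, 200, 400, 800, ... Hz
    return [fundamental_hz * 2 ** k for k in range(n)]

def sine_wave(freq_hz, duration_s=1.0, sample_rate=44100):
    # A pure tone (sinusoid) at the given frequency
    t = np.arange(int(duration_s * sample_rate)) / sample_rate
    return np.sin(2 * np.pi * freq_hz * t)

# The two sine waves from the figure: 440 Hz and 880 Hz, a 2:1 ratio, one octave apart.
a4, a5 = sine_wave(440), sine_wave(880)

# A rough trombone-like composite: sum the first six harmonics of 100 Hz,
# with the k-th harmonic's amplitude falling off as 1/k (an arbitrary choice).
composite = sum(sine_wave(f) / k
                for k, f in enumerate(harmonic_series(100, 6), start=1))
```

Played back, `composite` fuses into a single tone with a 100 Hz pitch rather than six separately audible sine waves, which is the melding the post describes.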