A CW pulse consists of a constant frequency and a duration (pulse length).
Range resolution is:

rr = (c * T) / 2

where c is the speed of sound (in air 330m/s) and T is the pulse length.
So to get a range resolution of 2cm, we need a pulse length of 121us.
If the sampling frequency is 48000Hz, the CW pulse is approx. 6 samples long.
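A quick sanity check of these numbers, as a minimal Python sketch (the constants are the ones used above):

```python
c = 330.0     # speed of sound in air [m/s]
fs = 48000.0  # sampling frequency [Hz]
rr = 0.02     # desired range resolution [m]

# Range resolution of a CW pulse: rr = (c * T) / 2, so T = 2 * rr / c
T = 2 * rr / c
print(f"pulse length: {T * 1e6:.0f} us")   # ~121 us
print(f"samples at 48 kHz: {T * fs:.1f}")  # ~5.8, i.e. approx. 6 samples
```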
Pulse length vs. bandwidth
There is an inverse relationship between pulse length and bandwidth of the transmitted signal.
A short pulse has a wide bandwidth.
Likewise, if the pulse is 10 times longer, we get a narrow bandwidth on the longer pulse.
With a narrow bandwidth, we will be able to process the signal to listen only to those specific frequencies and filter out noise. Further, sending long pulses requires less energy to reach the same distance. The drawback is that the range resolution is worse (10x to be exact).
The spectrum has a main lobe, the signal we want to process, but it also has a lot of side lobes, which are unwanted signals.
We can manage the side lobes using window functions. There are a lot of different window function designs, with different properties: Boxcar (rectangular), Hamming, Kaiser and Chebyshev to name a few. The latter are pretty good at suppressing side lobes, but the cost is a wider main lobe.
To put that in perspective, a wide main lobe will make the objects in the beamformed image wider. Objects will look wider than they are, so a wide main lobe is not what we want.
Again, we want narrow beams and good range resolution, so we can identify the objects in the image.
An example with a Kaiser window function applied to the transmitted (long) pulse:
The side lobes are almost gone, but the main lobe is wider.
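A small sketch of this trade-off, assuming the 10x long pulse from above and a Kaiser beta of 8 (the beta value is my illustrative choice, not taken from the processing chain):

```python
import numpy as np

fs = 48000   # sampling rate [Hz]
f0 = 12000   # CW frequency [Hz]
T = 1.21e-3  # the "long" (10x) pulse [s]

t = np.arange(int(T * fs)) / fs
pulse = np.sin(2 * np.pi * f0 * t)

# Kaiser window; beta trades main-lobe width against side-lobe level
win = np.kaiser(len(pulse), 8.0)

spec_box = np.abs(np.fft.rfft(pulse, 8192))        # boxcar (no window)
spec_kai = np.abs(np.fft.rfft(pulse * win, 8192))  # Kaiser-windowed

db = lambda s: 20 * np.log10(s / s.max())
# Level 2 kHz away from the carrier: side lobes are far lower for the
# windowed pulse, while its main lobe is visibly wider.
k = round(14000 / fs * 8192)
print(f"boxcar @ 14 kHz: {db(spec_box)[k]:6.1f} dB")
print(f"kaiser @ 14 kHz: {db(spec_kai)[k]:6.1f} dB")
```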
CW pulses are easy to understand and require less post-processing to generate great-looking images. But total range might suffer, as short pulses require more energy.
Intrapulse modulation (or FM)
If we, instead of transmitting short pulses, transmit longer pulses with a built-in frequency change, a sweep from frequency F1 to F2, AND post-process the signal with range compression (using a matched filter), the range resolution is defined as:

rr = c / (2 * BW)

where BW is the bandwidth of the transmitted signal. Notice that the range resolution is no longer dependent on the pulse length (in theory at least). The larger the BW, the better range resolution we get.
Example: the range resolution of an FM transmitted signal with f1=12000Hz and f2=14000Hz is approx. 8.25cm.
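For illustration, a sketch generating such a pulse with scipy.signal.chirp, using a 3ms pulse (an arbitrary length; with a matched filter the length no longer drives the range resolution):

```python
import numpy as np
from scipy.signal import chirp

fs = 48000             # sampling rate [Hz]
T = 3e-3               # pulse length [s]
f1, f2 = 12000, 14000  # sweep limits [Hz], BW = 2 kHz

t = np.arange(int(T * fs)) / fs
pulse = chirp(t, f0=f1, t1=T, f1=f2, method='linear')

c = 330.0
rr = c / (2 * (f2 - f1))
print(f"range resolution: {rr * 100:.2f} cm")  # ~8.25 cm
```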
Pulse compression is a filter that matches the transmitted pulse against the received data. It filters out data that is not correlated with the pulse, hence increasing the S/N significantly.
The red pulse is transmitted, and when the pulse hits objects, it is reflected and the blue echoes are returned. When applying the matched filter, the impulse response is shortened to the short "blips" seen to the right (hence compression of the pulse).
In real-life sonars, the data is echoes from the objects and surroundings, and the compression filter, aka matched filter, removes everything but the matching pulse.
A sonar ping under water will contain background noise from the surroundings, maybe engine noise from a ship or similar. This is effectively removed, so only data echoed from the transmitted pulse remains.
The matched filter can be processed in the frequency domain:

MD = IFFT(FFT(RD) * conj(FFT(pulse)))

where MD is the resulting data after match filtering, RD is the raw received data and pulse is the transmitted pulse.
Or in the time domain:

MD = RD convolved with conj(pulse(-t))

i.e. convolution with the time-reversed, conjugated pulse (a cross-correlation with the pulse).
In my project, I use the frequency-domain implementation, since it's running in CUDA.
The above is basically just an FIR filter. You could use the same implementation to remove...
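A minimal NumPy sketch of the frequency-domain version (my CUDA implementation differs in the details, but the math is the same; the names are illustrative):

```python
import numpy as np

def match_filter(rd, pulse):
    """Frequency-domain matched filter (pulse compression).

    rd    : raw received data, one channel
    pulse : replica of the transmitted pulse
    """
    n = len(rd) + len(pulse) - 1       # pad to get linear, not circular, convolution
    RD = np.fft.fft(rd, n)
    P = np.fft.fft(pulse, n)
    md = np.fft.ifft(RD * np.conj(P))  # correlation = multiply by the conjugate
    return md[:len(rd)]                # for real inputs, md.real is the compressed data
```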
There is a lot of literature that explains the dark magic around IQ signals, but in essence, together they form a complex number, which can be represented as a real and an imaginary value - or a magnitude and a phase.
The I and Q amplitudes form a combined amplitude and phase.
When two waves (complex IQ) are mixed, the phase difference between the two waves determines whether the resulting wave will undergo constructive or destructive interference.
Given two microphones that record a sound from some angle, the same sound is recorded by both microphones, but with a time delay. The time delay corresponds to the angle from which the sound originated, or a difference in phase between the two signals.
Time delay for element j (tde):

tde(j) = pde(j) * sin(angle) / c

where angle is the direction we want to listen in (steering the beam), pde(j) is the phase delay for element j, and c is the speed of sound.
Phase delay for element j (pde) in the array:

pde(j) = (j - ne/2) * d

where ne is the number of elements in the array and d is the element spacing.
Sample delay for element j (sde):

sde(j) = tde(j) * fs

where fs is the sampling frequency. Note that the "center" channel is the ne/2'th element.
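Putting the three formulas together, a small sketch, assuming the 16-element, d=2.05cm array and 48kHz sampling described further down:

```python
import numpy as np

def delays(angle_deg, ne=16, d=0.0205, c=330.0, fs=48000.0):
    """Per-element delays for steering a uniform line array.

    pde(j): element position relative to the center (ne/2'th) element
    tde(j): time delay for element j
    sde(j): the same delay expressed in samples
    """
    j = np.arange(ne)
    pde = (j - ne / 2) * d                         # [m]
    tde = pde * np.sin(np.radians(angle_deg)) / c  # [s]
    sde = tde * fs                                 # [samples]
    return tde, sde

tde, sde = delays(angle_deg=20.0)
print(sde)  # fractional sample delays, one per element
```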
Sampling the data from the elements (channels)
When sampling data from the array's elements, we sample real values; the phase is missing from the sampling.
To generate the phase, in order for us to generate the IQ data, we run the Hilbert transform:

iq = real + i * H(real)

where real is the sampled data per channel, H is the Hilbert transform, and iq is the analytic (IQ) signal in the time domain.
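In Python this is one call per channel matrix, e.g. with scipy.signal.hilbert (random data stands in for real recordings here):

```python
import numpy as np
from scipy.signal import hilbert

# real_data: N samples x M channels of real-valued samples
real_data = np.random.randn(4096, 16)

# hilbert() returns the analytic signal: the real part is the input, the
# imaginary part is its Hilbert transform, i.e. the IQ data per channel.
iq = hilbert(real_data, axis=0)

print(iq.shape, iq.dtype)  # (4096, 16) complex128
```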
Combining all the channels with IQ data, we get an NxM matrix of IQ data, where N (rows) is the number of samples per channel and M (cols) is the number of channels.
Beam forming using FFT
Beamforming the data, i.e. transforming it from channels to beams, is simple:

beams = FFT(IQ)

where IQ is in the frequency domain (after a 1D FFT in the time direction) and this FFT is a 1D transform in the channel direction.
Now each column in the matrix represents the frequency data in the angle direction.
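A minimal sketch of that step; the fftshift is just a convention so the columns run from negative to positive angles:

```python
import numpy as np

n_samples, n_channels = 4096, 16
# IQ data, already transformed to the frequency domain along the sample axis
iq = np.fft.fft(np.random.randn(n_samples, n_channels), axis=0)

# Beamform: a 1D FFT in the channel direction turns channels into beams
beams = np.fft.fftshift(np.fft.fft(iq, axis=1), axes=1)

print(beams.shape)  # (4096, 16): each column is now a beam, not a channel
```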
The angles in the beamformed data:

angle(i) = asin((i - ne/2) * c / (ne * d * f))

where i is the i'th beam, c is the speed of sound, ne is the number of elements, d is the element spacing and f is the frequency.
Note that the angle now becomes dependent on the frequency of the signal. Listening to multi-frequency signals like whale song, the (calculated) angle changes while the whale sings. FFTs are probably not the best way to detect direction when the frequency varies.
Example:
If we want to calculate the angle for f=200kHz, with ne=256 and d=0.02m, for beam i=100, the angle is -0.51 degrees. Changing the frequency to 220kHz, the angle is -0.47deg. A small change, if we do a chirp.
Using f=2kHz, the angle is -64deg, and for f=3kHz, the angle is -37deg. A frequency sweep is impossible to beamform at lower frequencies with larger bandwidth.
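The angle formula above as a small helper that reproduces these numbers:

```python
import numpy as np

def beam_angle(i, f, ne=256, d=0.02, c=330.0):
    """Angle of the i'th FFT beam at frequency f (uniform line array)."""
    return np.degrees(np.arcsin((i - ne / 2) * c / (ne * d * f)))

print(beam_angle(100, 200e3))  # ~ -0.52 deg
print(beam_angle(100, 220e3))  # ~ -0.47 deg
print(beam_angle(100, 2e3))    # ~ -64 deg
print(beam_angle(100, 3e3))    # ~ -37 deg
```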
Beam forming using CZT transform
The Chirp-Z transform, or zoom-FFT, is a variant of the FFT. It consists of 2x forward FFTs and 1x inverse FFT, so it requires 3x the computations. The advantage is that the CZT allows us to specify the angles in the transform, basically letting us zoom into a pre-defined angle space.
One of the biggest advantages is that the angle calculations are independent of the frequency. The angle space is the same regardless of the frequency, which is what we want.
Further, we can specify the angles in the angle space we want to use, whereas the FFT is pre-set to a specific angle space.
angle(i) = a1 + i * (a2 - a1) / ne

where a1 and a2 are the zoom angles and ne is the number of elements. Notice that the angle space is defined only by the zoom angles and the number of elements.
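A hedged sketch of the idea using scipy.signal.czt (SciPy 1.8+). This version spaces the beams uniformly in sin(angle) between the zoom angles, and the CZT parameters are recomputed per frequency bin, which is exactly what keeps the angle space frequency-independent:

```python
import numpy as np
from scipy.signal import czt  # SciPy >= 1.8

def czt_beamform(iq_row, f, a1_deg, a2_deg, n_beams, d=0.0102, c=330.0):
    """Beamform one frequency bin (one row of samples across channels)
    onto n_beams beams between the zoom angles a1..a2."""
    s1, s2 = np.sin(np.radians([a1_deg, a2_deg]))
    step = (s2 - s1) / (n_beams - 1)
    a = np.exp(-2j * np.pi * f * d * s1 / c)   # starting point on the unit circle
    w = np.exp(2j * np.pi * f * d * step / c)  # ratio between consecutive beams
    return czt(iq_row, m=n_beams, w=w, a=a)

# Simulated plane wave from +10 deg hitting the 24-element, d=1.02cm array
f, d, c = 20e3, 0.0102, 330.0
row = np.exp(-2j * np.pi * f * d * np.arange(24) * np.sin(np.radians(10)) / c)
beams = czt_beamform(row, f, a1_deg=-45, a2_deg=45, n_beams=128)
print(np.argmax(np.abs(beams)))  # ~79: the beam closest to +10 deg
```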
Clearly, the CZT is the preferred beamformer, if FFTs are to be used.
Further, the open angle is worth revisiting.
The open angle specifies the outer limits of a1 and a2; by staying within it, we make sure that we do not conflict with the grating lobes of the beam radiation pattern, which would cause "mirror" effects in the final image.
The first test is to send a CW signal and record the response on the 16 channels. The result is beamformed and visualized as a polar image, where angle vs. range is plotted. Range is in meters, angles in degrees.
Total range: 30m
The swath is pre-set to the open angle, as described above. On the right is the color scale. The resulting image is built up of 5 pings. Images are stacked. The stacking process removes some of the speckle noise or background noise and amplifies the objects.
The brighter lines are actual objects that reflect the ping. Distance is measured by counting samples and converting to meters. Looking closely at the image, we can see that the objects fade above 15m out. The room I am testing in is much smaller, so everything above 1.5m-2.5m out is multipath, i.e. sound reflected off walls and furniture.
A similar distance measurement device, like the SR04, is designed to detect objects within the swath and closer than 5m. The above image shows objects more than 15m away, plus where they are.
The CW pulse is short, 100us long, which requires more power to transmit in order to reach further away.
One solution is to use a different pulse: FM (frequency modulated), where the frequency sweeps. Using this in combination with a matched filter, we can use longer pulses, 0.5ms-3ms, so less power is needed. The result should be that we can detect objects further away than the 15m.
Total range is 50m.
Notice the object just above 35m and -7deg. This is not visible using the CW pulse.
On purpose, I have not interpolated the beams in the above images. This is to highlight how wide the beams actually are and how large the objects appear. The 16 channels have been oversampled into 128 beams, which explains why the objects seem larger than the beam width from the beamforming.
The oversampling trick is especially interesting when we apply movement to the sonar, similar to what mobile phones are doing with super-resolution. They use multiple images and align them to get an even higher resolution image.
Since my project is centred around the audible frequency spectrum, I am using "speakers" (sound emitters) and simple microphones.
Transmitters:
Ultrasound transmitters from the SR04. They are cheap and work in the 35-45kHz range.
Speakers and an amplifier from an old gaming setup I don't use anymore. They can send sound in the audible spectrum from around 50Hz to 18kHz. Actually, they do transmit at 30kHz and even 40kHz; the signal is really weak, but not too weak, as my microphones can pick it up.
Receivers:
I use cheap MAX9814 electret microphone amplifier modules. I remove the on-board microphone and wire up the microphones in an array setup. The amplifier is used to feed sound into my audio interface.
I have designed a very simple 8-channel PCB that wires power, GND and output to the mini-jack connectors. I can then use the boards in 8-channel configurations: 3x boards = 24 channels.
The audio interface side is a bit of a mess. I use two EVO16s, one as master and one as slave. Each one can record 8 channels. The 3rd unit is just a Behringer ADA-8200 preamp that sends digital audio via the optical connection to the master EVO16. In total I have 24 channels if I run at 48kHz sampling; I can do 16 channels at a 96kHz sampling rate.
The master EVO is connected to my computer, where my processing software is running. Processing is computationally heavy. My computer has an Nvidia GPU, so I use CUDA for the heavy lifting of the FFTs required to beamform.
At some point, I might replace the expensive audio-interfaces with cheaper analog-digital converters.
Using professional studio-quality ADA's lets me focus on the arrays and the software. They are low noise, hum-free, come with great software and interface very nicely with the computer's audio interfaces.
The custom PCB was designed only to help reduce the wiring of power, ground and output for each of the MAX9814s. Also, the preamp has a gain option that needed a wire. Doing this with wires, I would need 4 wires per preamp times the number of channels, currently 16 but soon 24. That's almost 100 wires, just for connecting the preamps to the sampling units.
The schematic below handles 8 channels. The output ends in a mini-jack connector, so I can use standard (high quality) audio cables to lead to the sampling units.
I used the EasyEDA service to:
Draw the schematics
Layout the PCB
Select components (the mini-jack connectors, jumper connectors)
Order the PCB. The minimum order size is 5. I needed 3 for my project; the remaining two I used for prototyping and other experiments.
5 days after I submitted my schematics, I received the boards by courier.
PCB layout.
Assembled 8 channel receiver.
The pins on top of the pre-amps are for connecting the microphone arrays. This way, microphone arrays can easily be swapped around.
A sound array can be constructed in an infinite number of ways. It consists of receiver elements, typically microphones or piezo elements. The array can also act as a transmitter, if the elements can both send and receive (sound) waves. Piezos have an interesting feature: when a current is sent through a piezo element, it expands, i.e. it transforms electrical current into movement, just like a loudspeaker. But if the element is exposed to waves, it generates current, just like a microphone.
A simple array is just a 1D array, where elements are placed with equal distances. The introduction video shows this quite nicely.
There are several properties that influence the way the array works:
Element spacing
Frequency
Sound speed
The above allows us to calculate the time delay that has to be applied to the signal sent to each of the elements, to effectively send sound in a specific direction.
Likewise, the same time-delay is part of the audio signal that is received by the microphones.
Parameters to consider:
Frequency can only vary slightly, meaning the "ping" sent by the transmitter must have a relatively narrow band. An example is F=40kHz, like the well-known ultrasound transmitter (SR04). However, some pulses can have a frequency sweep (FM), where the frequency sweeps from 38kHz to 42kHz.
If the frequency varies too much, the angle-phase relationship changes and we cannot derive where the sound is coming from.
Sound speed can be considered (relatively) constant. 330m/s in air and 1500m/s in water.
Element spacing. There is a "golden rule of thumb": d = wavelength/2, where d can be slightly higher, but that introduces grating lobes in the beam radiation pattern. (A quick calculation follows this list.)
Element spacing around 2cm for air and frequency around 10kHz is ok.
Element spacing around 1cm for air and frequency around 20kHz is ok.
For sonars using 400kHz, the element spacing must be around 1-2mm.
Open-angle, meaning the widest angle that we can beamform, is oa = wavelength/d (in radians). Example: oa = (330m/s / 10kHz) / 2cm = 1.65 rad, approx. 95 degrees, i.e. a swath of roughly +/- 47 degrees.
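The quick calculation promised above, checking the d = wavelength/2 rule of thumb for the three cases:

```python
def half_wavelength_spacing(f, c):
    """Golden rule of thumb: element spacing d = wavelength / 2."""
    return c / f / 2

print(half_wavelength_spacing(10e3, 330.0) * 100)     # ~1.65 cm -> ~2 cm in air at 10 kHz
print(half_wavelength_spacing(20e3, 330.0) * 100)     # ~0.83 cm -> ~1 cm in air at 20 kHz
print(half_wavelength_spacing(400e3, 1500.0) * 1000)  # ~1.9 mm for a 400 kHz sonar in water
```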
The more elements, the narrower the beams and the better the image resolution when we start drawing maps based on sonar pings (imaging sonar). 8 elements are not enough to map, but can be used for detection. 16 elements can show recognisable features. 32 is 2x better than 16, and so on. 256 or 512 generate great-looking images. Ultrasound scanners use 40-60 elements.
2D arrays, seen in ultrasound, can generate awesome-looking 3D images. They could have 60x60 elements and run at MHz frequencies, so beams are ultra narrow, but at short range.
The lower the frequency, the longer the range. Whales, sending out sound in the 500Hz-5kHz range (audible), can be heard thousands of kilometers away. MHz arrays have a range on the cm scale.
For my project I have several arrays:
1D 16 elements, d=2.05cm. Works fine at 10kHz.
1D 24 elements, d=1.02cm. Works fine at 20kHz.
2D 5x5 elements, d=1.02cm.
1D 4-element transmitter, d=1.6cm. Works best under 20kHz; above that we get grating lobes above +/-70 deg. So, at 22kHz, as long as we transmit frequencies 18kHz-22kHz at angles below +/-70 deg, we are good. Beam width is around +/-15 deg. I plan an experiment where I transmit 4 pulses in the vertical plane and beamform each one on a 4x 30 deg swath (with some overlap); hopefully we get a pseudo 3D image using only 2x 1D arrays.
The following array is the 1D 24 channel array.
The beam radiation pattern for the 1D 24 channel array.
The color scale is the intensity. The X-axis is the angle and the Y-axis is the frequency of the pulse. The lower the frequency, the wider the beam. At higher frequencies, grating lobes start to appear.
Setting the frequency to 24kHz, we can plot the intensity:
Beam width at 24kHz is around +/- 3 deg. At 1m each beam will be 5cm wide; at 10m beams are 50cm. So, it's important to get as narrow a beam as possible if we want to map details on objects at longer distances.
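A back-of-the-envelope check of those footprints, assuming the figure corresponds to a total beam width of about 3 deg (small-angle approximation, width = range * beamwidth in radians):

```python
import math

def footprint(range_m, beamwidth_deg=3.0):
    """Approximate beam footprint at a given range."""
    return range_m * math.radians(beamwidth_deg)

print(footprint(1.0))   # ~0.05 m: 5 cm wide at 1 m
print(footprint(10.0))  # ~0.52 m: about 50 cm at 10 m
```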