Sound

This module provides the functionality used to collect audio observations from gameplay.

record audio

Uses PyAudio to record the Windows Stereo Mix device. You must enable Stereo Mix in the Windows sound settings.

(Screenshot: stereo-mixer-windowspng.png)



normalize_audio

 normalize_audio (audio)

This function normalizes audio data.

Parameters:
- audio (np.ndarray): The audio data to be normalized.

Returns:
- np.ndarray: The normalized audio data.
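The page does not say which normalization scheme `normalize_audio` applies. As a minimal sketch, assuming peak normalization (scaling so the largest absolute sample becomes 1.0), the idea might look like the following. The name `normalize_audio_sketch` is hypothetical and is not the library's implementation.

```python
import numpy as np


def normalize_audio_sketch(audio: np.ndarray) -> np.ndarray:
    """Peak-normalize audio into [-1, 1].

    Assumption: peak normalization. The documented function may use a
    different scheme (for example RMS or fixed int16 scaling).
    """
    # Work in float so integer input (e.g. int16 PCM) does not overflow
    audio = np.asarray(audio, dtype=np.float32)
    peak = np.max(np.abs(audio)) if audio.size else 0.0
    if peak == 0:
        # Silent or empty input: return unchanged to avoid dividing by zero
        return audio
    return audio / peak


samples = np.array([0, 8192, -16384, 4096], dtype=np.int16)
normalized = normalize_audio_sketch(samples)
print(normalized)
```

Guarding the zero-peak case matters in practice, because a silent recording would otherwise produce NaN values.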



detect_first_over_threshold

 detect_first_over_threshold (tensor, threshold)

This function finds the index of the first element in a tensor that is above a specified threshold.

Parameters:
- tensor (np.ndarray): The input tensor.
- threshold (float): The threshold value.

Returns:
- int: The index of the first element in the tensor above the threshold, or -1 if none are found.
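One way to get the behavior described above with NumPy is to build a boolean mask and take `np.argmax` of it, which returns the position of the first `True`. This is only a sketch under the assumption that "above" means strictly greater than; `detect_first_over_threshold_sketch` is a hypothetical name, not the library's code.

```python
import numpy as np


def detect_first_over_threshold_sketch(tensor, threshold):
    """Return the index of the first element > threshold, or -1 if none.

    Assumption: strict comparison. For multi-dimensional input the index
    refers to the flattened array.
    """
    above = np.asarray(tensor) > threshold
    if not above.any():
        # np.argmax would return 0 for an all-False mask, so handle it first
        return -1
    return int(np.argmax(above))


signal = np.array([0.0, 0.1, 0.6, 0.9])
first_loud = detect_first_over_threshold_sketch(signal, 0.5)
no_match = detect_first_over_threshold_sketch(signal, 2.0)
print(first_loud, no_match)
```

The explicit `any()` check is needed because `np.argmax` alone cannot distinguish "first element matches" from "nothing matches".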



get_input_device_index

 get_input_device_index (p, device_name='Stereo Mix (Realtek(R) Audio)')

This function finds the index of a specified audio input device.

Parameters:
- p (PyAudio): The PyAudio object.
- device_name (str): The name of the device to find the index of. Default is 'Stereo Mix (Realtek(R) Audio)'.

Returns:
- int: The index of the specified audio input device.
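A lookup like this can be written with PyAudio's `get_device_count()` and `get_device_info_by_index()`, which return the number of devices and a dictionary per device (including `name` and `maxInputChannels`). The sketch below is an illustration only: the name `find_input_device_index` is hypothetical, and raising `ValueError` when the device is missing is an assumption, since the page does not document that case.

```python
def find_input_device_index(p, device_name="Stereo Mix (Realtek(R) Audio)"):
    """Return the index of the input device whose name equals device_name.

    `p` is expected to be a pyaudio.PyAudio instance (or any object that
    exposes the same two methods).
    Assumption: exact name match, and ValueError if nothing matches.
    """
    for i in range(p.get_device_count()):
        info = p.get_device_info_by_index(i)
        # Skip output-only devices, which report zero input channels
        if info.get("maxInputChannels", 0) > 0 and info.get("name") == device_name:
            return i
    raise ValueError(f"Input device not found: {device_name!r}")


# Typical use on a machine with PyAudio installed (not run here):
#   import pyaudio
#   p = pyaudio.PyAudio()
#   index = find_input_device_index(p)
#   p.terminate()
```

If the lookup fails on your machine, printing each device's `name` in the loop shows the exact string Windows reports for Stereo Mix, which varies by audio driver.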



record_audio

 record_audio (duration, device_name='Stereo Mix (Realtek(R) Audio)',
               normalize=True, verbose=False)

This function records audio from a specified device for a given duration.

Parameters:
- duration (int): The duration of the audio recording in seconds.
- device_name (str): The name of the device to record audio from. Default is 'Stereo Mix (Realtek(R) Audio)'.
- normalize (bool): Whether to normalize the audio recording. Default is True.
- verbose (bool): Whether to print the start time of the recording. Default is False.

Returns:
- SoundSequenceObservation: An object containing the start and end timestamps of the recording, and the audio data.
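To make the return value more concrete, here is a rough sketch of how raw byte chunks read from an audio stream could be combined with timestamps into one observation. Everything in it is an assumption for illustration: `SoundSequenceSketch` is a hypothetical stand-in for `SoundSequenceObservation`, the field names `start_time` and `end_time` are guesses (only `data` appears in the usage example on this page), and the samples are assumed to be 16-bit PCM.

```python
import time
from dataclasses import dataclass

import numpy as np


@dataclass
class SoundSequenceSketch:
    """Hypothetical stand-in for SoundSequenceObservation."""
    start_time: float  # assumed field name
    end_time: float    # assumed field name
    data: np.ndarray


def chunks_to_observation(chunks, start_time, end_time):
    """Join raw byte chunks and decode them as 16-bit PCM samples."""
    raw = b"".join(chunks)
    data = np.frombuffer(raw, dtype=np.int16)
    return SoundSequenceSketch(start_time, end_time, data)


# Synthetic chunks stand in for the bytes a blocking stream read would return
start = time.time()
fake_chunks = [
    np.array([1, 2], dtype=np.int16).tobytes(),
    np.array([3, 4], dtype=np.int16).tobytes(),
]
end = time.time()
obs = chunks_to_observation(fake_chunks, start, end)
print(obs.data)
```

Recording the timestamps immediately around the read loop keeps the observation aligned with other data captured during the same gameplay window.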

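import matplotlib.pyplot as plt
from IPython.display import Audio

# Record 3 seconds of audio from the default Stereo Mix device
audio_sequence = record_audio(duration=3, normalize=True)

# Plot the waveform of the recorded audio
plt.plot(audio_sequence.data)
plt.show()

# Play the recording back in the notebook
Audio(data=audio_sequence.data, rate=44100)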