Peak clipping when layering audio files in Java

Question

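As part of a project I'm working on, I'm trying to layer multiple audio clips on top of one another to create the sound of a crowd, and write the result to a new .WAV file.

First, I create a byte[] representation of a file (a 16-bit PCM .WAV file), which doesn't seem to cause any problems.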

public byte[] toByteArray(File file)
{
    try
    {
        AudioInputStream in = AudioSystem.getAudioInputStream(file);

        byte[] byteArray = new byte[(int) file.length()];//make sure the size is correct

        while (in.read(byteArray) != -1) ;//read in byte by byte until end of audio input stream reached

        return byteArray;//return the new byte array
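    }
    catch (UnsupportedAudioFileException | IOException e)
    {
        throw new IllegalStateException("Could not read " + file, e);//assumed error handling; the original handler was not posted
    }
}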

Then I create a buffer (an int array, to prevent overflow when adding the bytes together) and try to layer in the byte-array versions of my files.

int[] buffer = new int[bufferLength];//buffer of appropriate length
int offset = 0;//no offset for the very first file

while(!convertedFiles.isEmpty())//until every sample has been added
{
    byte[] curr = convertedFiles.pop();//get a sample from list

    if(curr.length+offset < bufferLength)
    {
        for (int i =0; i < curr.length; i++)
        {
            buffer[i] += curr[i];
        }
    }

    offset = randomiseOffset();//next sample placed in a random location in the buffer
}

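The problem arises when I try to implement a random offset. I can add all the audio to my buffer starting at index 0 (buffer[0]), so that everything plays at once, and that plays back fine. However, if I try to scatter the individual clips randomly throughout the buffer, I run into problems.

When I try to add a file at an offset, relative to the length of the buffer, I get horrible static and peak clipping.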

 buffer[i+offset] += curr[i];

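I realise I need to be careful to avoid overflow, which is why I tried using an int buffer rather than a byte one.

What I don't understand is why it only breaks when I introduce the offset.

I haven't posted the code that actually uses the AudioSystem object to create the new file, as it doesn't seem to have any effect either way.

This is my first time doing audio programming, so any help is much appreciated.

EDIT:

Hendrik's answer solved my problem, but I had to change the suggested code slightly (some type conversion issues):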

private static short byteToShortLittleEndian(final byte[] buf, final int offset)
{
    int sample = (buf[offset] & 0xff) + ((buf[offset+1] & 0xff) << 8);
    return (short)sample;
}

private static byte[] shortToByteLittleEndian(final short[] samples, final int offset)
{
    byte[] buf = new byte[2];
    int sample = samples[offset];
    buf[0] = (byte) (sample & 0xFF);
    buf[1] = (byte) ((sample >> 8) & 0xFF);
    return buf;
}

Answer 1

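What does your randomiseOffset() method look like? Does it take into account that each audio sample is two bytes long? If randomiseOffset() gives you an odd offset, you end up mixing the low byte of one sample with the high byte of another, which sounds like (usually horrible) noise. Perhaps that is the sound you took to be clipping.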
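Since randomiseOffset() was not posted, the following is only a sketch of what a frame-aligned version could look like. The class name, method name and parameters are hypothetical; frameSize is assumed to be 2 for 16-bit mono PCM.

```java
import java.util.Random;

final class AlignedOffset {
    // Hypothetical replacement for randomiseOffset(): returns a random byte
    // offset that is a multiple of frameSize and leaves room for the clip.
    static int randomAlignedOffset(Random rng, int bufferLength, int clipLength, int frameSize) {
        int maxStart = bufferLength - clipLength;   // last byte index the clip may start at
        if (maxStart <= 0) {
            return 0;                               // clip fills the buffer, no room to shift
        }
        int wholeFrames = maxStart / frameSize;     // number of frame-aligned start positions - 1
        return rng.nextInt(wholeFrames + 1) * frameSize;
    }
}
```

With frameSize set to 2 the result is always even, so the low and high bytes of different samples never get added together.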

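To do this properly, you need to decode the audio first, i.e. take the sample length (2 bytes) and the number of channels (?) into account, do your manipulation, and then encode the audio back into a byte stream.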
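As a rough illustration of the "manipulation" step once the audio has been decoded, the sketch below adds one clip into an int mix buffer at an offset counted in samples rather than bytes. The class and method names are hypothetical, and it assumes mono 16-bit samples.

```java
final class SampleMixer {
    // Adds every sample of clip into mix, starting at sampleOffset.
    // mix is an int[] so that sums of several 16-bit samples cannot overflow here.
    static void mixInto(int[] mix, short[] clip, int sampleOffset) {
        if (sampleOffset < 0 || sampleOffset + clip.length > mix.length) {
            throw new IllegalArgumentException("clip does not fit at offset " + sampleOffset);
        }
        for (int i = 0; i < clip.length; i++) {
            mix[sampleOffset + i] += clip[i];
        }
    }
}
```

Because the offset is expressed in whole samples, any value is valid and the byte-alignment problem cannot occur.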

Let's assume you have just one channel and the byte order is little-endian. Then you decode two bytes into one sample value like this:

private static int byteToShortLittleEndian(final byte[] buf, final int offset) {
    int sample = (buf[offset] & 0xff) + ((buf[offset+1] & 0xff) << 8);
    return (short)sample;
}

To encode, you would use something like this:

private static byte[] shortToByteLittleEndian(final int[] samples, final int offset) {
    byte[] buf = new byte[2];
    int sample = samples[offset];
    buf[0] = (byte) (sample & 0xFF);         // low byte first (little-endian)
    buf[1] = (byte) ((sample >> 8) & 0xFF);  // then the high byte
    return buf;
}

Here is how you would use these two methods in your case:

byte[] byteArray = ...;  // your array
// DECODE: convert to sample values
int[] samples = new int[byteArray.length / 2];  // one sample per two bytes
for (int i=0; i<samples.length; i++) {
    samples[i] = byteToShortLittleEndian(byteArray, i*2);
}
// now do your manipulation on the samples array
[...]
// ENCODE: convert back to byte values
byte[] byteOut = new byte[byteArray.length];
for (int i=0; i<samples.length; i++) {
    byte[] b = shortToByteLittleEndian(samples, i);
    byteOut[2*i] = b[0];
    byteOut[2*i+1] = b[1];
}
// do something with byteOut ...

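(Note that you can easily make this more efficient by decoding/encoding in bulk rather than handling individual samples as shown above. I just thought it is easier to understand this way.)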
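One possible way to do the bulk conversion is with java.nio.ByteBuffer, which can view a byte array as little-endian shorts. This is a sketch under the same assumptions (16-bit little-endian PCM); the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BulkPcm {
    // Decodes little-endian 16-bit PCM bytes into samples in one call.
    static short[] decode(byte[] bytes) {
        short[] samples = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(samples);
        return samples;
    }

    // Encodes samples back into little-endian 16-bit PCM bytes.
    static byte[] encode(short[] samples) {
        byte[] bytes = new byte[samples.length * 2];
        ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(samples);
        return bytes;
    }
}
```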

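During your manipulation, you have to keep an eye on your sample values. They must not be greater than Short.MAX_VALUE or less than Short.MIN_VALUE. If you detect that you are outside the valid range, simply scale the whole array. That way you avoid clipping.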
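Scaling the whole array could be done along these lines. This is a minimal sketch, not the only way to do it: it finds the largest absolute value and, only if that exceeds Short.MAX_VALUE, multiplies every sample by the same gain. The class and method names are hypothetical.

```java
final class PeakScaler {
    // Scales the mixed samples so the loudest one just fits into 16 bits.
    // If everything is already in range, the values are copied unchanged.
    static short[] scaleToShortRange(int[] mix) {
        long peak = 0;
        for (int s : mix) {
            peak = Math.max(peak, Math.abs((long) s)); // long avoids overflow for Integer.MIN_VALUE
        }
        double gain = peak > Short.MAX_VALUE ? (double) Short.MAX_VALUE / peak : 1.0;
        short[] out = new short[mix.length];
        for (int i = 0; i < mix.length; i++) {
            out[i] = (short) Math.round(mix[i] * gain);
        }
        return out;
    }
}
```

Applying one gain to the entire array keeps the relative loudness of the clips intact, whereas clamping individual samples would distort the peaks.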

Good luck!
