Best cross-platform method to get aligned memory

Date: 2023-01-24 12:10:13

Here is the code I normally use to get aligned memory with Visual Studio and GCC

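#include <stdlib.h>   // posix_memalign, free
#ifdef _MSC_VER
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

inline void* aligned_malloc(size_t size, size_t align) {
    void *result;
#ifdef _MSC_VER
    result = _aligned_malloc(size, align);
#else
    // posix_memalign returns nonzero on failure
    if (posix_memalign(&result, align, size)) result = 0;
#endif
    return result;
}

inline void aligned_free(void *ptr) {
#ifdef _MSC_VER
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}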

Is this code fine in general? I have also seen people use _mm_malloc, _mm_free. In most cases that I want aligned memory it's to use SSE/AVX. Can I use those functions in general? It would make my code a lot simpler.

Lastly, it's easy to create my own function to align memory (see below). Why then are there so many different common functions to get aligned memory (many of which only work on one platform)?

This code does 16 byte alignment.

float* array = (float*)malloc(SIZE*sizeof(float)+15);

// find the aligned position
// and use this pointer to read or write data into array
// (uintptr_t from <stdint.h>; unsigned long is only 32 bits on 64-bit Windows)
float* alignedArray = (float*)(((uintptr_t)array + 15) & ~(uintptr_t)0x0F);

// deallocate the original "array", NOT alignedArray
free(array);
array = alignedArray = 0;
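
The rounding step above works for any power-of-two alignment, not just 16. A minimal sketch, assuming a hypothetical align_up helper and a hypothetical AlignedFloats struct (neither is from the linked article) that keeps the raw and aligned pointers together so the raw one is still around for free():

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: round `addr` up to the next multiple of `align`.
// Assumes `align` is a nonzero power of two.
inline std::uintptr_t align_up(std::uintptr_t addr, std::size_t align) {
    const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
    return (addr + mask) & ~mask;
}

// Hypothetical pair of pointers into one malloc'd block.
struct AlignedFloats {
    void*  raw;   // what malloc returned; this is the one to pass to free()
    float* data;  // aligned view into the same block
};

inline AlignedFloats alloc_floats(std::size_t count, std::size_t align) {
    AlignedFloats r{nullptr, nullptr};
    // Over-allocate by align - 1 bytes so an aligned address always fits.
    r.raw = std::malloc(count * sizeof(float) + align - 1);
    if (r.raw) {
        r.data = reinterpret_cast<float*>(
            align_up(reinterpret_cast<std::uintptr_t>(r.raw), align));
    }
    return r;
}
```

Read and write through data, and release the block with std::free(raw).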

See: http://www.songho.ca/misc/alignment/dataalign.html and How to allocate aligned memory only using the standard library?

Edit: In case anyone cares, I got the idea for my aligned_malloc() function from Eigen (Eigen/src/Core/util/Memory.h)

Edit: I just discovered that posix_memalign is undefined for MinGW. However, _mm_malloc works for Visual Studio 2012, GCC, MinGW, and the Intel C++ compiler so it seems to be the most convenient solution in general. It also requires using its own _mm_free function, although on some implementations you can pass pointers from _mm_malloc to the standard free / delete.
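
For reference, a minimal sketch of the _mm_malloc / _mm_free pairing on an x86 target. The helper names add_one and simd_roundtrip are made up for illustration, and I am assuming <xmmintrin.h> makes _mm_malloc visible, which is the case with GCC and Clang; other compilers may need a different header.

```cpp
#include <cstddef>
#include <cstdint>
#include <xmmintrin.h>  // SSE intrinsics; assumed to expose _mm_malloc/_mm_free too

// Hypothetical helper: adds 1.0f to each float, four at a time.
// Assumes `count` is a multiple of 4 and `data` is 16-byte aligned.
inline void add_one(float* data, std::size_t count) {
    const __m128 ones = _mm_set1_ps(1.0f);
    for (std::size_t i = 0; i < count; i += 4) {
        __m128 v = _mm_load_ps(data + i);             // aligned load
        _mm_store_ps(data + i, _mm_add_ps(v, ones));  // aligned store
    }
}

// Hypothetical round trip: allocate, use with SSE, release with _mm_free.
inline bool simd_roundtrip() {
    const std::size_t count = 8;
    float* data = static_cast<float*>(_mm_malloc(count * sizeof(float), 16));
    if (!data) return false;

    for (std::size_t i = 0; i < count; ++i) data[i] = static_cast<float>(i);
    add_one(data, count);

    const bool ok = reinterpret_cast<std::uintptr_t>(data) % 16 == 0
                 && data[0] == 1.0f && data[7] == 8.0f;
    _mm_free(data);  // always pair _mm_malloc with _mm_free
    return ok;
}
```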

5 answers

#1


4  

The first function you propose would indeed work fine.

Your "homebrew" function also works, but has the drawback that if the value is already aligned, you have just wasted 15 bytes. May not matter sometimes, but the OS may well be able to provide memory that is correctly allocated without any waste (and if it needs to be aligned to 256 or 4096 bytes, you risk wasting a lot of memory by adding "alignment-1" bytes).
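
To put numbers on that, here is a small sketch with two hypothetical helpers (my names, not from the question). The over-allocation is always alignment - 1 bytes; what changes with the address malloc returns is only how much of it ends up in front of the aligned pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: extra bytes requested by the over-allocate-and-round
// approach. Assumes `align` is a nonzero power of two.
inline std::size_t extra_bytes(std::size_t align) {
    return align - 1;
}

// Hypothetical helper: how many of those bytes are skipped in front of the
// aligned pointer when the block starts at `addr`.
inline std::size_t front_padding(std::uintptr_t addr, std::size_t align) {
    const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
    return static_cast<std::size_t>(((addr + mask) & ~mask) - addr);
}
```

For a 16-byte alignment the cost is 15 bytes per allocation; for a 4096-byte alignment it is 4095 bytes, which is the case where an allocator that hands out aligned blocks directly is clearly preferable.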

#2


8  

As long as you're ok with having to call a special function to do the freeing, your approach is okay. I would do your #ifdefs the other way around though: start with the standards-specified options and fall back to platform-specific ones. For example

  1. If __STDC_VERSION__ >= 201112L use aligned_alloc.
  2. If _POSIX_VERSION >= 200112L use posix_memalign.
  3. If _MSC_VER is defined, use the Windows stuff.
  4. ...
  5. If all else fails, just use malloc/free and disable SSE/AVX code.

The problem is harder if you want to be able to pass the allocated pointer to free; that's valid on all the standard interfaces, but not on Windows and not necessarily with the legacy memalign function some unix-like systems have.
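
A rough sketch of that ordering, written as C++ to match the rest of the thread. All names here (portable_aligned_alloc, portable_aligned_free, PORTABLE_ALIGNED_ALLOC) are made up for illustration. I have left out the C11 aligned_alloc step because __STDC_VERSION__ is normally not defined when compiling as C++, so treat this as one possible layout rather than a complete cascade.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#if defined(__unix__) || defined(__APPLE__)
#include <unistd.h>   // defines _POSIX_VERSION on POSIX systems
#endif
#if defined(_MSC_VER)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// `align` is assumed to be a power of two and a multiple of sizeof(void*),
// which posix_memalign requires.
#if defined(_POSIX_VERSION) && _POSIX_VERSION >= 200112L
#define PORTABLE_ALIGNED_ALLOC 1
inline void* portable_aligned_alloc(std::size_t size, std::size_t align) {
    void* p = nullptr;
    return posix_memalign(&p, align, size) == 0 ? p : nullptr;
}
inline void portable_aligned_free(void* p) { std::free(p); }

#elif defined(_MSC_VER)
#define PORTABLE_ALIGNED_ALLOC 1
inline void* portable_aligned_alloc(std::size_t size, std::size_t align) {
    return _aligned_malloc(size, align);
}
inline void portable_aligned_free(void* p) { _aligned_free(p); }

#else
// Last resort: plain malloc with no alignment guarantee beyond the default.
// Callers should check PORTABLE_ALIGNED_ALLOC and skip SSE/AVX paths when 0.
#define PORTABLE_ALIGNED_ALLOC 0
inline void* portable_aligned_alloc(std::size_t size, std::size_t) {
    return std::malloc(size);
}
inline void portable_aligned_free(void* p) { std::free(p); }
#endif
```

Keeping a dedicated free function in every branch sidesteps the question of whether the pointer may be passed to plain free.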

#3


2  

Here is a fixed version of user2093113's sample; the code as posted didn't build for me (void* has unknown size). I also put it in a template class that overrides operator new/delete, so you don't have to do the allocation yourself and then call placement new.

#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>

template<std::size_t Alignment>
class Aligned
{
public:
    void* operator new(std::size_t size)
    {
        // Room for the object, worst-case padding, and the saved pointer.
        std::size_t space = size + (Alignment - 1);
        void *original_ptr = std::malloc(space + sizeof(void*));
        if (!original_ptr) throw std::bad_alloc();

        // Skip the slot reserved for the saved pointer, then align.
        void *ptr = static_cast<char*>(original_ptr) + sizeof(void*);
        ptr = std::align(Alignment, size, ptr, space);

        // Store the pointer malloc returned just below the aligned block.
        char *ptr_bytes = static_cast<char*>(ptr) - sizeof(void*);
        std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));

        return ptr;
    }

    void operator delete(void* ptr)
    {
        if (!ptr) return;

        // Recover the pointer malloc returned and free that one.
        char *ptr_bytes = static_cast<char*>(ptr) - sizeof(void*);
        void *original_ptr;
        std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));

        std::free(original_ptr);
    }
};

Use it like this:

class Camera : public Aligned<16>
{
};

Didn't test the cross-platform-ness of this code yet.

#4


1  

If your compiler supports it, C++11 adds a std::align function to do runtime pointer alignment. You could implement your own malloc/free like this (untested):

#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

template<std::size_t Align>
void *aligned_malloc(std::size_t size)
{
    std::size_t space = size + (Align - 1);
    void *ptr = std::malloc(space + sizeof(void*));
    if (!ptr) return nullptr;
    void *original_ptr = ptr;

    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes += sizeof(void*);
    ptr = static_cast<void*>(ptr_bytes);

    ptr = std::align(Align, size, ptr, space);

    // Store the pointer malloc returned just below the aligned block.
    ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);
    std::memcpy(ptr_bytes, &original_ptr, sizeof(void*));

    return ptr;
}

void aligned_free(void* ptr)
{
    if (!ptr) return;
    char *ptr_bytes = static_cast<char*>(ptr);
    ptr_bytes -= sizeof(void*);

    void *original_ptr;
    std::memcpy(&original_ptr, ptr_bytes, sizeof(void*));

    std::free(original_ptr);
}

Then you don't have to keep the original pointer value around to free it. Whether this is 100% portable I'm not sure, but I hope someone will correct me if not!
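
If std::align itself is unfamiliar, here is a small sketch of what it does to its arguments. The wrapper name bytes_skipped is made up for illustration. On success std::align moves the pointer forward to the next aligned address and reduces the space argument by the number of bytes it skipped; if the request does not fit, it returns a null pointer and leaves both arguments alone.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical wrapper: returns how many bytes std::align skipped to reach an
// `align`-aligned address with room for `want` bytes, or size_t(-1) if the
// request does not fit in [buf, buf + buf_size).
inline std::size_t bytes_skipped(unsigned char* buf, std::size_t buf_size,
                                 std::size_t align, std::size_t want) {
    void* p = buf;
    std::size_t space = buf_size;
    if (!std::align(align, want, p, space)) {
        return static_cast<std::size_t>(-1);
    }
    return buf_size - space;  // space shrank by exactly the skipped bytes
}
```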

#5


0  

Here are my 2 cents:

// needs <cstdint> for std::uintptr_t; temp keeps the pointers to pass to delete[]
temp = new unsigned char*[num];
AlignedBuffers = new unsigned char*[num];
for (int i = 0; i < num; i++)
{
    temp[i] = new unsigned char[bufferSize + 15];
    // round up to a 16-byte boundary in preparation for SSE
    AlignedBuffers[i] = reinterpret_cast<unsigned char*>(
        (reinterpret_cast<std::uintptr_t>(temp[i]) + 15) & ~static_cast<std::uintptr_t>(15));
}
