Invalid pointer when returning std::string (so says libc)

Time: 2022-12-03 19:28:47

I have a member function in a class that consumes an mmap'd file. It looks like this:

std::string Data::GetASCIIZ(OFFSET* offsetp) const
{
  char* str = (char*)_buffer + *offsetp;  // _buffer points to mmap'd file
  *offsetp += strlen(str) + 1;
  return std::string(str);
}

(the type of 'OFFSET' is unsigned long long)

Its raison d'etre is to (a) return a std::string of the null-terminated C-string that is presumed to exist at offset *offsetp, after (b) advancing the value of *offsetp past the end of said C-string.
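To make the intended behaviour concrete, here is a minimal, self-contained sketch that swaps the mmap'd file for an in-memory buffer. The free function `read_asciiz` and its `buffer` parameter are hypothetical stand-ins for `Data::GetASCIIZ` and `_buffer`; only the offset arithmetic mirrors the method above.

```cpp
#include <cassert>
#include <cstring>
#include <string>

typedef unsigned long long OFFSET;

// Hypothetical stand-in for Data::GetASCIIZ: reads the NUL-terminated string
// at buffer + *offsetp, then advances *offsetp past its terminator.
std::string read_asciiz(const char* buffer, OFFSET* offsetp)
{
  assert(buffer != nullptr && offsetp != nullptr);
  const char* str = buffer + *offsetp;
  *offsetp += std::strlen(str) + 1;  // +1 skips the terminating '\0'
  return std::string(str);           // returned by value; owns its own copy
}
```

Given a buffer holding "ab", an empty string, and "cd" back to back, three successive calls starting at offset 0 should return those three strings in order and leave the offset at 3, 4, and 7 respectively.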

I call this function in numerous situations, without issue. However, I have recently added a new call to it, that always SIGABRTs in a peculiar way:

*** glibc detected *** /home/ryan/src/coolapp/out/coolapp: free(): invalid pointer: 0xb7eb165c ***

The above message is followed by a backtrace (culminating in some code within libc.so.6), and a memory map... both of which are ostensibly useful to me somehow, in debugging this issue.

From debugging with GDB, I've learned that the SIGABRT doesn't actually happen inside my Data::GetASCIIZ method quoted above, but rather within the code that calls it during the right side of an assignment. (So, I presume during the invocation of std::string's copy constructor):

[EDIT: updated to dovetail with an expected answer from @WhozCraig]

struct stuff
{
  char version;
  std::string sigstring;
  // ...
};

stuff* mystuff = (stuff*)malloc(sizeof(stuff));
// ...
mystuff->sigstring = _data->GetASCIIZ(offsetp);  // SIGABRT HAPPENS AT THIS SCOPE

In this particular situation, the C-string at offset *offsetp happens to be an empty string, but I've verified that that is not consequential by temporarily modifying *offsetp to point to something else from within GDB.

My method is marked const because it does not modify any of the internal state of the Data object. I am returning an object that lives on the stack, but I am not doing so by reference, and I expected the copy constructor (in the calling code) to do the right thing before that stack item was destructed.

I have tried rewriting the GetASCIIZ method to use an explicit local, but that did not help.

Am I missing something?

In case it is useful, here is the disassembly of the call-during-assignment where this SIGABRT happens. (The '==>' is at the point of the error.)

424         sigstring = _data->GetASCIIZ(offsetp);
   0x0807def1 <+183>:   mov    0x8(%ebp),%eax
   0x0807def4 <+186>:   mov    0x4(%eax),%eax
   0x0807def7 <+189>:   lea    0x4(%eax),%ecx
   0x0807defa <+192>:   lea    -0x18(%ebp),%eax
   0x0807defd <+195>:   mov    0x1c(%ebp),%edx
   0x0807df00 <+198>:   mov    %edx,0x8(%esp)
   0x0807df04 <+202>:   mov    %ecx,0x4(%esp)
   0x0807df08 <+206>:   mov    %eax,(%esp)
   0x0807df0b <+209>:   call   0x809e6ee <Data::GetASCIIZ(unsigned long long*) const>
   0x0807df10 <+214>:   sub    $0x4,%esp
   0x0807df13 <+217>:   mov    -0x14(%ebp),%eax
   0x0807df16 <+220>:   lea    0x4(%eax),%edx
   0x0807df19 <+223>:   lea    -0x18(%ebp),%eax
   0x0807df1c <+226>:   mov    %eax,0x4(%esp)
   0x0807df20 <+230>:   mov    %edx,(%esp)
   0x0807df23 <+233>:   call   0x8049560 <_ZNSsaSEOSs@plt>
   0x0807df28 <+238>:   lea    -0x18(%ebp),%eax
   0x0807df2b <+241>:   mov    %eax,(%esp)
=> 0x0807df2e <+244>:   call   0x80497f0 <_ZNSsD1Ev@plt>
   0x0807e026 <+492>:   lea    -0x18(%ebp),%eax
   0x0807e029 <+495>:   mov    %eax,(%esp)
   0x0807e02c <+498>:   call   0x80497f0 <_ZNSsD1Ev@plt>
   0x0807e031 <+503>:   mov    %ebx,%eax
   0x0807e033 <+505>:   jmp    0x807e046 <CoolClass::SpiffyMethod(unsigned long long, unsigned long long, unsigned long long*)+524>
   0x0807e035 <+507>:   mov    %eax,%ebx

1 Solution

#1


-1  

Your sample is as follows.

std::string Data::GetASCIIZ(OFFSET* offsetp) const
{
  char* str = (char*)_buffer + *offsetp;  // _buffer points to mmap'd file
  *offsetp += strlen(str) + 1;
  return std::string(str);
}

Shouldn't the return statement return a new STL string?

std::string Data::GetASCIIZ(OFFSET* offsetp) const
{
  char* str = (char*)_buffer + *offsetp;  // _buffer points to mmap'd file
  *offsetp += strlen(str) + 1;
  return new std::string(str);
}
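
As written, that second version would not compile: `new std::string(str)` yields a `std::string*`, which does not convert to the `std::string` return type, and the original `return std::string(str);` already returns a fresh string by value. A more likely culprit, judging only by the calling code in the question, is that `stuff` is allocated with `malloc`, which never runs the constructor of `sigstring`. Assigning to an unconstructed `std::string` is undefined behavior, and that would be consistent with the abort surfacing in the temporary's destructor right after the move assignment. Below is a minimal sketch of two ways to get a properly constructed object; the helper names `with_new` and `with_placement_new` are made up for illustration and are not from the question.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <string>

struct stuff
{
  char version;
  std::string sigstring;
};

// Option 1: new/delete run the constructor and destructor of sigstring.
std::string with_new(const char* text)
{
  assert(text != nullptr);
  stuff* mystuff = new stuff();
  mystuff->sigstring = text;  // assigning to a constructed string is fine
  std::string result = mystuff->sigstring;
  delete mystuff;
  return result;
}

// Option 2: keep malloc for the storage, but construct the object in it
// with placement new and destroy it explicitly before freeing.
std::string with_placement_new(const char* text)
{
  assert(text != nullptr);
  void* raw = std::malloc(sizeof(stuff));
  if (raw == nullptr)
    throw std::bad_alloc();
  stuff* mystuff = new (raw) stuff();  // runs sigstring's constructor
  mystuff->sigstring = text;
  std::string result = mystuff->sigstring;
  mystuff->~stuff();                   // runs sigstring's destructor
  std::free(raw);
  return result;
}
```

Unless something else requires `malloc`, the first option is the simpler one; the second only shows what has to happen by hand when the storage must come from `malloc`.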
