Find all occurrences of a character in a string

Time: 2022-01-13 15:03:00

I have comma-delimited strings that I need to pull values from. The problem is that these strings will never be a fixed size, so I decided to iterate through the commas and read what is in between. To do that, I made a function that returns the position of every occurrence of a character in a sample string.

Is this a smart way to do it? Is this considered bad code?

#include <string>
#include <iostream>
#include <vector>
#include <Windows.h>

using namespace std;

vector<int> findLocation(string sample, char findIt);

int main()
{
    string test = "19,,112456.0,a,34656";
    char findIt = ',';

    vector<int> results = findLocation(test,findIt);
    return 0;
}

vector<int> findLocation(string sample, char findIt)
{
    vector<int> characterLocations;
    for(int i =0; i < sample.size(); i++)
        if(sample[i] == findIt)
            characterLocations.push_back(sample[i]);

    return characterLocations;
}

5 solutions

#1


11  

vector<int> findLocation(string sample, char findIt)
{
    vector<int> characterLocations;
    for(int i =0; i < sample.size(); i++)
        if(sample[i] == findIt)
            characterLocations.push_back(sample[i]);

    return characterLocations;
}

As currently written, this will simply return a vector containing the int representations of the characters themselves, not their positions, which is what you really want, if I read your question correctly.

Replace this line:

characterLocations.push_back(sample[i]);

with this line:

characterLocations.push_back(i);

And that should give you the vector you want.
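
To make the difference concrete, here is a minimal sketch that puts the original line and the corrected line side by side. The names findCharacters and findLocations are made up for this illustration, and taking the string by const reference is an assumption that is not in the question's code.

```cpp
#include <string>
#include <vector>

// Original behaviour: stores the matched character itself (as an int).
std::vector<int> findCharacters(const std::string& sample, char findIt)
{
    std::vector<int> result;
    for (std::string::size_type i = 0; i < sample.size(); i++)
        if (sample[i] == findIt)
            result.push_back(sample[i]);            // the character, e.g. 44 for ',' in ASCII
    return result;
}

// Corrected behaviour: stores the index at which the character was found.
std::vector<int> findLocations(const std::string& sample, char findIt)
{
    std::vector<int> result;
    for (std::string::size_type i = 0; i < sample.size(); i++)
        if (sample[i] == findIt)
            result.push_back(static_cast<int>(i));  // the position
    return result;
}
```

For the question's test string "19,,112456.0,a,34656", the first version yields four copies of the comma's character code (44 in ASCII), while the second yields the positions 2, 3, 12 and 14.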

#2


6  

If I were reviewing this, I would assume that what you're really trying to do is tokenize a string, and there are already good ways to do that.

Best way I've seen to do this is with boost::tokenizer. It lets you specify how the string is delimited and then gives you a nice iterator interface to iterate through each value.

#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>

int main()
{
    using namespace boost;
    std::string sample = "Hello,My,Name,Is,Doug";
    // Empty escape and quote sets: only ',' has special meaning
    escaped_list_separator<char> sep("" /*escape chars*/, "," /*separators*/, "" /*quotes*/);

    tokenizer<escaped_list_separator<char> > myTokens(sample, sep);

    // iterate through the contents
    for (tokenizer<escaped_list_separator<char> >::iterator iter = myTokens.begin();
         iter != myTokens.end();
         ++iter)
    {
        std::cout << *iter << std::endl;
    }
}

Output:

Hello
My
Name
Is
Doug

Edit: If you don't want a dependency on Boost, you can also use getline with an istringstream, as in this answer. To copy somewhat from that answer:

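#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::string str = "Hello,My,Name,Is,Doug";
    std::istringstream stream(str);
    std::string tok1;

    // Test getline itself, so a failed read at end of input is never printed
    while (std::getline(stream, tok1, ','))
    {
        std::cout << tok1 << std::endl;
    }
}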

Output:

 Hello
 My
 Name
 Is
 Doug

This may not be exactly what you're asking, but I think it addresses the overall problem you're trying to solve.
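
Applied to the question's own data, the getline approach might be wrapped up as below. This is only a sketch: splitFields is a hypothetical helper name, not something from the question or from the answer linked above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on `delim` using std::getline.
std::vector<std::string> splitFields(const std::string& line, char delim)
{
    std::vector<std::string> fields;
    std::istringstream stream(line);
    std::string field;
    // An empty field between two delimiters comes back as an empty string.
    // A single trailing delimiter does not produce a final empty field.
    while (std::getline(stream, field, delim))
        fields.push_back(field);
    return fields;
}
```

Calling splitFields("19,,112456.0,a,34656", ',') gives five fields, the second of which is empty, so variable-length input needs no position bookkeeping at all.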

#3


0  

Looks good to me too. One comment is on the naming of your variables and types. You call the vector you are going to return characterLocations, and it holds int, when really you are pushing back the character itself (which is type char), not its location. I am not sure what the larger application is, but I think it would make more sense to pass back the locations. Or use a more cookie-cutter string tokenizer.
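
If the locations are what gets passed back, one way to use them is sketched below. Both delimiterPositions and fieldsBetween are hypothetical names for this illustration; the second one assumes the positions are sorted in ascending order and all lie inside the string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: positions of every `findIt` in `sample`, via std::string::find.
std::vector<std::size_t> delimiterPositions(const std::string& sample, char findIt)
{
    std::vector<std::size_t> positions;
    for (std::size_t pos = sample.find(findIt);
         pos != std::string::npos;
         pos = sample.find(findIt, pos + 1))
    {
        positions.push_back(pos);
    }
    return positions;
}

// Hypothetical helper: the text before, between, and after the given positions.
// Assumes `positions` is ascending and every entry is a valid index into `sample`.
std::vector<std::string> fieldsBetween(const std::string& sample,
                                       const std::vector<std::size_t>& positions)
{
    std::vector<std::string> fields;
    std::size_t start = 0;
    for (std::size_t pos : positions)
    {
        fields.push_back(sample.substr(start, pos - start));
        start = pos + 1;
    }
    fields.push_back(sample.substr(start));   // text after the last delimiter
    return fields;
}
```

Storing std::size_t rather than int matches what std::string::find and size() return, which also makes the element type say "index" rather than "character".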

#4


0  

How smart it is also depends on what you do with those substrings delimited by commas. In some cases it may be better (e.g. faster, with smaller memory requirements) to avoid searching and splitting, and just parse and process the string at the same time, possibly using a state machine.
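
A rough sketch of the parse-and-process idea follows. It is the simplest possible case rather than a full state machine: the only state is the field currently being built. The name forEachField and the callback parameter are assumptions made for this example.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: walks the string once and hands each field to `onField`
// as soon as it is complete, without storing any delimiter positions.
void forEachField(const std::string& sample, char delim,
                  const std::function<void(const std::string&)>& onField)
{
    std::string current;            // the only state: the field being built
    for (char c : sample)
    {
        if (c == delim)
        {
            onField(current);       // field finished, process it immediately
            current.clear();
        }
        else
        {
            current += c;
        }
    }
    onField(current);               // last field (may be empty)
}
```

A caller passes a lambda that does the real work, for example converting each field to a number, so no intermediate vector of positions or substrings is needed. Handling quoted fields or escapes would require adding explicit states to the loop.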

#5


0  

Well, if your purpose is to find the indices of the occurrences, the following code will be more efficient: in C++, passing objects as parameters by value causes them to be copied, which is less efficient. Returning the vector by value can cost another copy on compilers that lack copy elision or move semantics, which is why passing it in as a reference argument is better in that case.

#include <string>
#include <iostream>
#include <vector>

using namespace std;

// Forward declaration; it must match the definition below
void findLocation(const string& sample, char findIt, vector<int>& resultList);

int main()
{

    string test = "19,,112456.0,a,34656";
    char findIt = ',';

    vector<int> results;
    findLocation(test,findIt, results);
    return 0;
}

void findLocation(const string& sample, const char findIt, vector<int>& resultList)
{
    const int sz = sample.size();

    for(int i =0; i < sz; i++)
    {
        if(sample[i] == findIt)
        {
            resultList.push_back(i);
        }
    }
}
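
For comparison, here is a sketch of the same loop that keeps the const reference parameter but returns the vector by value. It assumes a C++11 or later compiler, where returning a local vector moves it or elides the copy entirely. The name locateAll and the use of std::size_t for indices are choices made for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch: read-only input by const reference, result returned by value.
// Since C++11 the returned local vector is moved or elided, not deep-copied.
std::vector<std::size_t> locateAll(const std::string& sample, char findIt)
{
    std::vector<std::size_t> locations;
    for (std::size_t i = 0; i < sample.size(); ++i)
    {
        if (sample[i] == findIt)
        {
            locations.push_back(i);
        }
    }
    return locations;
}
```

One practical difference from the out-parameter version: that version appends to whatever resultList already holds, so a caller reusing the same vector must clear it first, whereas this form always starts from an empty result.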
