Randomly skip "X" percent of words from a Java iterator

Time: 2021-12-06 00:42:29

I have some Java code like this:

   String line = value.toString();
   StringTokenizer tokenizer = new StringTokenizer(line);

   while (tokenizer.hasMoreTokens()) {
         String token = tokenizer.nextToken();
         // do something with token
   }

However, I want the code to randomly skip X percent of the tokens.

Example: if the tokens are [a, b, c, d] and the skip percentage is 50%, a valid execution could print any two tokens, say [b, c] or [a, d].

How can I implement this in the simplest manner?

4 solutions

#1


1  

First Solution:

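double percentage = 50.0;                            // percentage of tokens to skip
int max = (int) (percentage / 100.0 * token.length); // number of tokens to skip
Random rnd = new Random();                           // requires java.util.Random

int[] skip = new int[token.length];
int count = 0;
while (count < max)
{
    int rand = rnd.nextInt(token.length);
    if (skip[rand] == 0) {   // not marked yet
        skip[rand] = 1;
        count++;
    }
}

// Print each token whose skip entry is 0; skip those whose entry is 1.
for (int i = 0; i < token.length; i++) {
    if (skip[i] == 0) {
        System.out.println(token[i]);
    }
}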

You may consider this: create a 1D array of switches (it can be boolean too) with the same length as the token array. Randomly set switches until the required number are marked. Print a token only if the switch at its index is not set; otherwise skip it.


Second solution:

Convert your array of tokens to an ArrayList.

int printed = 0, x = 0;

while (printed < max && !list.isEmpty()) {  // where max is the number of elements to print
    if (x >= list.size()) {
        x = 0;                      // wrap around for another pass
    }
    if (rnd.nextInt(2) == 0) {      // two possible outcomes: 50% chance
        System.out.println(list.get(x));
        list.remove(x);             // the next element shifts into index x
        printed++;
    } else {
        x++;
    }
}

On every iteration, roll a probability (e.g. a 50% chance) to decide whether to print the current element. Once an element is printed, remove it from the list so you won't print duplicates.


Third solution:

Randomly remove a percentage (e.g. 50%) of the elements from your tokens, then just print the rest. This is probably one of the most straightforward ways I can think of.
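The third solution has no code, so here is a minimal sketch of one way it might look. The class name `RandomRemoval` and the helper `removeRandomly` are made up for illustration, and rounding the skip count with `Math.round` is an assumption; truncation would work just as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.StringTokenizer;

public class RandomRemoval {

    // Hypothetical helper: returns a copy of tokens with roughly skipPercent
    // percent of the elements removed at random. The survivors keep their
    // original relative order.
    public static List<String> removeRandomly(List<String> tokens, double skipPercent, Random rnd) {
        List<String> remaining = new ArrayList<>(tokens);
        // Assumption: round to the nearest whole number of tokens.
        int toRemove = (int) Math.round(tokens.size() * skipPercent / 100.0);
        for (int i = 0; i < toRemove && !remaining.isEmpty(); i++) {
            remaining.remove(rnd.nextInt(remaining.size())); // remove by index
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Collect the tokens first so they can be indexed.
        StringTokenizer tokenizer = new StringTokenizer("a b c d");
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        // With 50% skipped, two of the four tokens are printed.
        System.out.println(removeRandomly(tokens, 50.0, new Random()));
    }
}
```

Removing from an ArrayList by index shifts the tail each time, so this is fine for a line of text but not the cheapest choice for very long lists.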

#2


2  

First calculate the number of tokens to skip, i.e. 0.50 * tokens.length (note that this is pseudocode).

Then I would create an array of length tokens.length and fill it with that many 1's and the rest 0's,

e.g. for 50% of 10: [1,1,1,1,1,0,0,0,0,0]

Then apply a simple shuffle algorithm (see "Random shuffling of an array")

to get something like [0,1,1,0,0,1,0,1,1,0]

Then, as you run through your tokenizer loop, walk through this array and check:

if (thisArray[i] == 0) {   // a 1 marks a token to skip
  print(token);
}
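The steps above could be put together roughly as follows. This is only a sketch: the class name `ShuffledSkipMask` and the helper `buildMask` are invented for illustration, a 1 in the mask is taken to mean "skip this token", and the shuffle is a plain Fisher-Yates loop rather than any particular library routine. It assumes 0 <= skipCount <= n.

```java
import java.util.Random;
import java.util.StringTokenizer;

public class ShuffledSkipMask {

    // Hypothetical helper: builds a mask of length n in which exactly
    // skipCount entries are 1 (skip) and the rest are 0 (print), then
    // shuffles it in place with a Fisher-Yates shuffle.
    public static int[] buildMask(int n, int skipCount, Random rnd) {
        int[] mask = new int[n];
        for (int i = 0; i < skipCount; i++) {
            mask[i] = 1;
        }
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1); // 0 <= j <= i
            int tmp = mask[i];
            mask[i] = mask[j];
            mask[j] = tmp;
        }
        return mask;
    }

    public static void main(String[] args) {
        StringTokenizer tokenizer = new StringTokenizer("a b c d e f g h i j");
        int n = tokenizer.countTokens();
        int skipCount = (int) (0.50 * n); // 50% of the tokens, truncated
        int[] mask = buildMask(n, skipCount, new Random());

        // Walk the mask in step with the tokenizer.
        for (int i = 0; tokenizer.hasMoreTokens(); i++) {
            String token = tokenizer.nextToken();
            if (mask[i] == 0) {
                System.out.print(token + " ");
            }
        }
        System.out.println();
    }
}
```

Because the mask is built before the loop starts, the tokens never need to be copied into a separate collection; `countTokens()` supplies the length up front.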

#3


1  

The following uses Floyd's subset selection algorithm to select a random subset of specified size. This may be overkill for a small number of tokens, but it's pretty darned efficient for larger sets.

import java.util.HashSet;

public class FloydsSubsetSelection {

   /*
    * Floyd's algorithm to choose a random subset of m integers
    * from a set of n, outcomes are zero-based.
    */
   public static HashSet<Integer> generateMfromN(int m, int n) {
      HashSet<Integer> s = new HashSet<Integer>();
      for (int j = n-m; j < n; ++j) {
         if(! s.add((int)((j+1) * Math.random()))) {
            s.add(j);
         }
      }
      return s;
   }

   public static void main(String[] args) {
      // Stuff the tokens into an array.  I've used chars,
      // but these could be anything you want.  You can also
      // store them in any container which is indexable.
      char[] tokens = {'a', 'b', 'c', 'd', 'e', 'f'};
      int desired_percent = 50;     // change as desired

      // Convert desired percent to a count.  I added 1/2 to cause rounding
      // rather than truncation, change if different behavior is desired.
      int m = (int) ((desired_percent * tokens.length) / 100.0 + 0.5);
      HashSet<Integer> results = generateMfromN(m, tokens.length);
      for (int i: results) {                 // iterate through the generated subset
         System.out.print(tokens[i] + " ");  // to print the selected tokens
      }
      System.out.println();
   }
}

#4


-1  

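   String line = value.toString();
   StringTokenizer tokenizer = new StringTokenizer(line);
   double percentage = 1.0 / 0.5; // replace 0.5 with the fraction of tokens you want to print
   int x = 0;
   // Note: this prints tokens on a fixed cycle, so the selection is not random.
   while (tokenizer.hasMoreTokens()) {
         String token = tokenizer.nextToken(); // consume the token so the loop advances
         ++x;
         if (x >= percentage) {
              System.out.println(token); // print here
              x = 0;
         }
   }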
