PAT 1145 Hashing - Average Search Time [hash][hard]

Date: 2023-03-31 19:42:32
1145 Hashing - Average Search Time (25 points)

The task of this problem is simple: insert a sequence of distinct positive integers into a hash table first. Then try to find another sequence of integer keys from the table and output the average search time (the number of comparisons made to find whether or not the key is in the table). The hash function is defined to be H(key)=key%TSize where TSize is the maximum size of the hash table. Quadratic probing (with positive increments only) is used to solve the collisions.

Note that the table size is better to be prime. If the maximum size given by the user is not prime, you must re-define the table size to be the smallest prime number which is larger than the size given by the user.
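The table-size rule can be sketched as a small helper. This is only an illustration: the names isPrimeSketch and tableSize are hypothetical and do not come from the problem statement.

```cpp
#include <cassert>

// Hypothetical helper: true if n is prime (trial division up to sqrt(n)).
bool isPrimeSketch(int n) {
    if (n < 2) return false;
    for (int i = 2; i * i <= n; i++) {
        if (n % i == 0) return false;
    }
    return true;
}

// Hypothetical helper: returns msize itself if it is prime,
// otherwise the smallest prime larger than msize.
int tableSize(int msize) {
    assert(msize >= 1);  // the problem guarantees a positive size
    while (!isPrimeSketch(msize)) msize++;
    return msize;
}
```

For the sample input, tableSize(4) gives 5, so all probing below is done modulo 5.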

Input Specification:

Each input file contains one test case. For each case, the first line contains 3 positive numbers: MSize, N, and M, which are the user-defined table size, the number of input numbers, and the number of keys to be found, respectively. All the three numbers are no more than 10^4. Then N distinct positive integers are given in the next line, followed by M positive integer keys in the next line. All the numbers in a line are separated by a space and are no more than 10^5.

Output Specification:

For each test case, in case it is impossible to insert some number, print in a line X cannot be inserted. where X is the input number. Finally print in a line the average search time for all the M keys, accurate up to 1 decimal place.

Sample Input:

4 5 4
10 6 4 15 11
11 4 15 2

Sample Output:

15 cannot be inserted.
2.8
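
The "15 cannot be inserted." line of the sample can be reproduced with a minimal insertion sketch. It assumes -1 marks an empty slot and uses a std::vector as the table; the name insertKey is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: tries to insert key into table (-1 marks an empty slot)
// using quadratic probing with positive increments only.
// Returns false when none of the table.size() probes finds a free slot.
bool insertKey(std::vector<int>& table, int key) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key > 0);
    const int h = key % size;
    for (int j = 0; j < size; j++) {
        const int slot = static_cast<int>((h + 1LL * j * j) % size);
        if (table[slot] == -1) {
            table[slot] = key;
            return true;
        }
    }
    return false;
}
```

With a table of 5 slots, inserting 10, 6, 4 fills slots 0, 1 and 4. For 15 the probes visit slots 0, 1, 4, 4, 1, all occupied, so the insertion fails. 11 then lands in slot 2, leaving the table as [10, 6, 11, empty, 4].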

Problem summary: resolve hash collisions using quadratic probing.

// This problem left me confused. Why does searching for 15 cost one extra comparison? Very strange.

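I learned about ways of resolving collisions in hashing:

1. Quadratic probing

First, h = hash(x) = x % maxSize;

Probing: for j in the range [0, maxSize-1], use the formula:

new = (h + j^2) % maxSize;

When searching:

If a probed slot holds -1 (empty), the key is not in the table, so stop;

If j has reached maxSize-1 and the key still has not been found, then it does not appear in the hash table.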

// I still really don't understand why one extra comparison has to be added here, though.
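
The search rule, including the extra count, can be sketched as follows. One common reading, which is an interpretation and not something the problem statement says, is that the extra comparison is the probe at j = maxSize, which returns to the home slot because maxSize^2 % maxSize = 0. The name searchCost is hypothetical, and -1 is assumed to mark an empty slot.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: number of comparisons made while searching for key,
// using quadratic probing with positive increments only.
int searchCost(const std::vector<int>& table, int key) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key > 0);
    const int h = key % size;
    for (int j = 0; j < size; j++) {
        const int slot = static_cast<int>((h + 1LL * j * j) % size);
        if (table[slot] == key || table[slot] == -1) {
            return j + 1;  // j + 1 slots were examined
        }
    }
    // Probes j = 0..size-1 all hit other keys. One more comparison is
    // counted here (assumed to be the probe at j == size).
    return size + 1;
}
```

On the sample table [10, 6, 11, empty, 4], the keys 11, 4, 15, 2 cost 2, 1, 6 and 2 comparisons. The total is 11 and the average is 11 / 4 = 2.75, which matches the 2.8 in the sample output. Without the extra comparison for 15 the average would be 10 / 4 = 2.5.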

Also, standard quadratic probing normally probes in both directions, with offsets of the form +1*1, -1*1, +2*2, -2*2, and so on.
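
For comparison only, since this problem uses positive increments, the two-directional probe order can be sketched like this. The name bidirectionalProbes is hypothetical, and starting the sequence at the home slot h is an assumption.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: the first `count` slots visited for home slot h in a
// table of `size` slots, in the order h, h+1^2, h-1^2, h+2^2, h-2^2, ...
std::vector<int> bidirectionalProbes(int h, int size, int count) {
    assert(size > 0 && h >= 0 && h < size && count >= 0);
    std::vector<int> slots;
    if (count > 0) slots.push_back(h);
    for (long long k = 1; static_cast<int>(slots.size()) < count; k++) {
        const long long sq = (k * k) % size;
        slots.push_back(static_cast<int>((h + sq) % size));
        if (static_cast<int>(slots.size()) < count) {
            // Add size before taking the remainder so the result is never negative.
            slots.push_back(static_cast<int>((h - sq + size) % size));
        }
    }
    return slots;
}
```

For h = 0 in a table of 7 slots, the first five slots visited are 0, 1, 6, 4, 3.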

Code reposted from: https://blog.csdn.net/qq_34594236/article/details/79814881

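#include <cstdio>
#include <cstring>
#include <cmath>
using namespace std;

// Returns true if num is prime (trial division up to sqrt(num)).
bool isPrime(int num) {
    if (num < 2) return false;
    for (int i = 2; i <= sqrt(num); i++) {
        if (num % i == 0) return false;
    }
    return true;
}

// Hash function: H(key) = key % TSize.
int H(int key, int TSize) {
    return key % TSize;
}

// 10010 slots: the largest possible table size is 10007.
int msize, n, m, a, table[10010];

int main() {
    // -1 marks an empty slot; all keys are positive.
    memset(table, -1, sizeof(table));
    scanf("%d%d%d", &msize, &n, &m);

    // Round the table size up to the next prime if it is not prime.
    while (isPrime(msize) == false) msize++;

    // Insert the n keys with quadratic probing (positive increments only).
    for (int i = 0; i < n; i++) {
        scanf("%d", &a);
        bool founded = false;
        for (int j = 0; j < msize; j++) {
            int d = j * j;
            int tid = (H(a, msize) + d) % msize;
            if (table[tid] == -1) {
                founded = true;
                table[tid] = a;
                break;
            }
        }
        if (founded == false) {
            printf("%d cannot be inserted.\n", a);
        }
    }

    // Search the m keys, counting every comparison in tot.
    int tot = 0;
    for (int i = 0; i < m; i++) {
        scanf("%d", &a);
        bool founded = false;
        for (int j = 0; j < msize; j++) {
            tot++;
            int d = j * j;
            int tid = (H(a, msize) + d) % msize;
            if (table[tid] == a || table[tid] == -1) { // found, or hit an empty slot (key absent)
                founded = true;
                break;
            }
        }
        // All msize probes failed: count one more comparison
        // (the sample output 2.8 is only reproduced with it).
        if (founded == false) {
            tot++;
        }
    }
    printf("%.1f\n", tot * 1.0 / m);
    return 0;
}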

// I really learned something from this.

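One more very important point, about segmentation faults: the length of the declared array. If the input size is 10000, it gets converted to the nearest prime, which can only be 10007, so it is best to declare the array with length 10010.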