HMM TOOL

Date: 2023-03-09 01:13:04

Handling different kinds of data with the MATLAB hidden Markov model (HMM) toolbox

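HMM toolbox download link:
Toolbox usage instructions:
The following briefly describes how to lay out data.
1. data is one-dimensional, and every training sequence has the same length.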
O = 3;
Q = 2; 
prior0 = normalise(rand(Q,1)); 
transmat0 = mk_stochastic(rand(Q,Q));
obsmat0 = mk_stochastic(rand(Q,O));

%Now we sample nex=20 sequences of length T=10 each from this model, to use as training data.

T=10;
nex=20;
data = dhmm_sample(prior0, transmat0, obsmat0, nex, T);

%Here data is 20x10. Now we make a random guess as to what the parameters are,

prior1 = normalise(rand(Q,1)); 
transmat1 = mk_stochastic(rand(Q,Q));
obsmat1 = mk_stochastic(rand(Q,O));

%and improve our guess using 5 iterations of EM...

[LL, prior2, transmat2, obsmat2] = dhmm_em(data, prior1, transmat1, obsmat1, 'max_iter', 5);


loglik = dhmm_logprob(data, prior2, transmat2, obsmat2)
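% loglik measures how well the data fits the model and is used to score test data:
% the larger the value, the better the match; 0 is the maximum.
2. data is multi-dimensional, and every training sequence has the same length.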
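To make the meaning of loglik concrete, here is a minimal sketch of the scaled forward algorithm for a single discrete observation sequence, written in base MATLAB. It is not the toolbox's implementation, and the parameter values and the sequence are made up for illustration. It shows why the log-likelihood of discrete data can never exceed 0: it is a sum of logs of probabilities.

```matlab
% Sketch: scaled forward algorithm for one discrete observation sequence.
% All values below are made-up illustration parameters, not from the toolbox.
prior    = [0.6; 0.4];                  % Q x 1 initial state distribution
transmat = [0.7 0.3; 0.2 0.8];          % Q x Q, each row sums to 1
obsmat   = [0.5 0.4 0.1; 0.1 0.3 0.6];  % Q x O, each row sums to 1
seq      = [1 3 2 3];                   % observed symbols, each in 1..O

alpha  = prior .* obsmat(:, seq(1));    % joint of first state and first symbol
loglik = 0;
for t = 1:numel(seq)
    if t > 1
        % propagate one time step, then weight by the emission probability
        alpha = (transmat' * alpha) .* obsmat(:, seq(t));
    end
    c      = sum(alpha);                % P(o_t | o_1 .. o_{t-1})
    loglik = loglik + log(c);           % accumulate in the log domain
    alpha  = alpha / c;                 % rescale to avoid numerical underflow
end
disp(loglik)                            % never positive for a discrete HMM
```

For classification, one would typically train one model per class and assign a new sequence to the class whose model gives the largest loglik.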
%Let us generate nex=50 vector-valued sequences of length T=50; each vector has size O=2.

O = 2; 
T = 50; 
nex = 50; 
data = randn(O,T,nex);
%Now let us fit a mixture of M=2 Gaussians for each of the Q=2 states using K-means.
M = 2;
Q = 2;
left_right = 0;
cov_type = 'full';  % covariance type passed to mixgauss_init (was undefined here)
prior0 = normalise(rand(Q,1));
transmat0 = mk_stochastic(rand(Q,Q));
[mu0, Sigma0] = mixgauss_init(Q*M, reshape(data, [O T*nex]), cov_type);
mu0 = reshape(mu0, [O Q M]);
Sigma0 = reshape(Sigma0, [O O Q M]);
mixmat0 = mk_stochastic(rand(Q,M));


%Finally, let us improve these parameter estimates using EM.
[LL, prior1, transmat1, mu1, Sigma1, mixmat1] = mhmm_em(data, prior0, transmat0, mu0, Sigma0, mixmat0, 'max_iter', 2);
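As a sanity check on the parameters used above, the constraints and array shapes can be sketched in base MATLAB. This is only an illustration under assumptions: it mimics what normalise and mk_stochastic are understood to do (scale a vector, or each row of a matrix, to sum to 1) without calling the toolbox, it assumes the initialiser returns an O x (Q*M) mean matrix and an O x O x (Q*M) covariance array, and all values are arbitrary placeholders.

```matlab
% Sketch: parameter constraints and shapes for a mixture-of-Gaussians HMM.
% Placeholder values only; no toolbox functions are called.
O = 2; Q = 2; M = 2;

prior    = rand(Q, 1);
prior    = prior / sum(prior);                             % entries sum to 1
transmat = rand(Q, Q);
transmat = bsxfun(@rdivide, transmat, sum(transmat, 2));   % each row sums to 1
mixmat   = rand(Q, M);
mixmat   = bsxfun(@rdivide, mixmat, sum(mixmat, 2));       % mixture weights per state

% Q*M Gaussian components, stored flat and then reshaped as in the text.
mu_flat    = randn(O, Q*M);
Sigma_flat = repmat(eye(O), [1 1 Q*M]);   % identity covariances as placeholders
mu    = reshape(mu_flat, [O Q M]);        % mu(:,q,m): mean of component m in state q
Sigma = reshape(Sigma_flat, [O O Q M]);   % Sigma(:,:,q,m): covariance of that component

disp(size(mu));
disp(size(Sigma));
```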
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
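Note: the array layout here is O*T*nex. Here is an example of how such an array is filled.
data0 = [x,y,z];
% data0 holds three-dimensional observations in T*nex rows in total:
% rows 1..T are the data for nex=1, rows T+1..2*T are the data for nex=2, and so on.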



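data = zeros(O,T,nex);  % preallocate; every entry is overwritten below
index = 1;
for k = 1:nex
    for j = 1:T
        data(:,j,k) = data0(index,:).';  % one row of data0 becomes one column of data
        index = index + 1;
    end
end
% Copying data0 into data as shown above is all that is needed.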
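The same copy can also be written without loops. The sketch below uses made-up sizes and placeholder values for data0, and checks that a single reshape of the transposed matrix gives the same O*T*nex array as the double loop.

```matlab
% Sketch with made-up sizes; data0 is filled with placeholder values.
O = 3; T = 4; nex = 2;
data0 = reshape(1:(T*nex*O), [T*nex, O]);  % (T*nex) x O, one observation per row

% Loop version, equivalent to the loop in the text.
data_loop = zeros(O, T, nex);
index = 1;
for k = 1:nex
    for j = 1:T
        data_loop(:, j, k) = data0(index, :).';
        index = index + 1;
    end
end

% Vectorised version: transpose to O x (T*nex), then split the columns into
% nex consecutive blocks of T (MATLAB stores arrays in column-major order).
data_vec = reshape(data0.', [O, T, nex]);

disp(isequal(data_loop, data_vec))
```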
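% To check how well new data matches this model (i.e. for classification),
% pass the trained parameters returned by mhmm_em:
loglik = mhmm_logprob(data, prior1, transmat1, mu1, Sigma1, mixmat1);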
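3. data is multi-dimensional and the training sequences have different lengths, i.e. how the HMM toolbox handles variable-length data.
This case is very common. For example, when recording continuous speech signals, the length (number of frames) differs from one recording to the next.
Suppose each observation has O dimensions, each sequence has T frames (T may differ from sequence to sequence), and NEX is the number of training sequences.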
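Step 1: store each sequence as an O*T matrix in a cell array with NEX rows (named cell_data here).
In my case each observation is 8-dimensional and there are 4 training sequences, each with a different length.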
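Step 1 can be sketched as follows. The frame counts and the random contents are hypothetical placeholders standing in for real feature matrices; only the layout matters: one O x T_i matrix per cell, with the same O everywhere.

```matlab
% Sketch: NEX sequences of different lengths, each stored as an O x T_i matrix.
% The lengths and the random contents are placeholders for real features.
O = 8;
lengths = [35 42 28 50];               % hypothetical frame counts T_i
NEX = numel(lengths);

cell_data = cell(NEX, 1);
for i = 1:NEX
    cell_data{i} = randn(O, lengths(i));   % replace with real O x T_i features
end

% Every cell must have the same number of rows (O); only the columns differ.
dims   = cellfun(@(x) size(x, 1), cell_data);
frames = cellfun(@(x) size(x, 2), cell_data);

% Pooling all frames side by side gives an O x sum(T_i) matrix, the kind of
% input a K-means style initialisation over all frames would take.
pooled = [cell_data{:}];
disp(size(pooled))
```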
Step 2: training code
    O = 8;          % dimensionality of each observation
    M = 2;          % number of Gaussian mixture components per state
    Q = 3;          % number of hidden states
    train_num = 4;  % number of training sequences
    data = [];

    % initial guess of parameters
    cov_type = 'full';
    prior0 = normalise(rand(Q,1));
    transmat0 = mk_stochastic(rand(Q,Q));
    % pool the frames of all sequences into one O x sum(T) matrix
    for train_len = 1 : train_num
        data = [data, cell_data{train_len}];
    end

    [mu0, Sigma0] = mixgauss_init(Q*M, data, cov_type);
    mu0 = reshape(mu0, [O Q M]);
    Sigma0 = reshape(Sigma0, [O O Q M]);
    mixmat0 = mk_stochastic(rand(Q,M));
    [LL, HMM.prior, HMM.transmat, HMM.mu, HMM.Sigma, HMM.mixmat] = ...