Word Frequency Count: Web Version

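Requirement: port the program to the web platform and accept input by having the user upload a TXT file. Keeping and maintaining the console version is recommended (but not required), since it makes testing easier.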
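One way to keep the console version alive next to the web version is to have both call the same counting routine on an InputStream. The sketch below is only an assumption about how that could be organized, not the project's actual code: the class name `SharedWordCounter` and the method `countWords` are hypothetical, the input is assumed to be UTF-8, and a word is taken to be a run of letters.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical shared core: a console entry point and a servlet could
// both pass their InputStream (file or upload) to countWords.
class SharedWordCounter {

    // Counts lower-cased runs of letters; assumes the text is UTF-8.
    static Map<String, Integer> countWords(InputStream in) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            // Split on anything that is not a letter; skip empty tokens
            for (String token : line.split("[^\\p{L}]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token.toLowerCase(Locale.ROOT), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Console entry point: reads the file named by the first argument,
    // or a built-in sample sentence when no argument is given.
    public static void main(String[] args) throws IOException {
        try (InputStream in = args.length > 0
                ? new FileInputStream(args[0])
                : new ByteArrayInputStream(
                        "the cat and the hat".getBytes(StandardCharsets.UTF_8))) {
            countWords(in).forEach((w, n) -> System.out.println(w + "\t" + n));
        }
    }
}
```

With this layout the console version doubles as a quick test harness: the same `countWords` call can be exercised from the command line before the servlet is deployed.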

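The page provides a file upload control, and the servlet receives the upload as a byte stream. The bytes are then converted to characters and counted by the existing code.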

The code for the JSP upload page is as follows:

<%@ page language="java" contentType="text/html; charset=utf-8"

    pageEncoding="utf-8"%>

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

<title>Insert title here</title>

</head>

<body>

 <table>

     <tr>

         <td>

             <form action="server/CountWordServlet" method="post" enctype="multipart/form-data">

             Upload the file to analyze: <input type="file" name="sourceFile"/>

                     <input type="submit" value="Upload">

             </form>

         </td>

     </tr>

 </table>

</body>

</html>

The page that displays the results is as follows:

<%@page import="com.server.servlet.Word"%>

<%@page import="java.util.ArrayList"%>

<%@ page language="java" contentType="text/html; charset=utf-8"

    pageEncoding="utf-8"%>

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

<%ArrayList<Word> list=(ArrayList<Word>)request.getAttribute("list"); %>

<title>Insert title here</title>

</head>

<body>

 <table>

             <%

             if(list!=null&&list.size()!=0){

                 %>

                 <tr> <td>Word</td><td>Count</td> </tr>

                 <%

                 for(int i=0;i<list.size();i++){

                      String word=((Word)list.get(i)).getWord();

                      int num=((Word)list.get(i)).getNum();

                      %><tr>

                          <td><%=word%></td>

                          <td><%=num%></td>

                      </tr>

                      <%

                  }

             }else{  %>

                 <tr><td colspan="2">The file contains no words, or the file does not exist</td></tr>

         <%     }

          %>

 </table>

</body>

</html>
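The page above relies only on `Word` exposing `getWord()` and `getNum()`. The class itself is not shown in this post, so the following is a guess at a minimal shape for it, together with one possible way to order the list by descending count before it is handed to the page. `WordListDemo` and `sortByCount` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class WordListDemo {
    // Returns a new list ordered by count (highest first), ties broken by word.
    static List<Word> sortByCount(List<Word> words) {
        List<Word> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparingInt(Word::getNum).reversed()
                .thenComparing(Word::getWord));
        return sorted;
    }

    public static void main(String[] args) {
        List<Word> words = new ArrayList<>();
        words.add(new Word("cat", 1));
        words.add(new Word("the", 2));
        words.add(new Word("and", 1));
        for (Word w : sortByCount(words)) {
            System.out.println(w.getWord() + "\t" + w.getNum());
        }
    }
}

// Assumed shape of the Word bean: only getWord() and getNum() are known
// from the JSP, the constructor and fields are illustrative.
class Word {
    private final String word;
    private final int num;

    Word(String word, int num) {
        this.word = word;
        this.num = num;
    }

    String getWord() { return word; }

    int getNum() { return num; }
}
```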

The servlet code is as follows:

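import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.commons.fileupload.FileItemIterator;
import org.apache.commons.fileupload.FileItemStream;
import org.apache.commons.fileupload.FileUploadException;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

public class CountWordServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        try {
            DiskFileItemFactory factory = new DiskFileItemFactory();
            ServletFileUpload upload = new ServletFileUpload(factory);
            FileItemIterator iterator = upload.getItemIterator(request);
            while (iterator.hasNext()) {
                FileItemStream item = iterator.next();
                // Skip ordinary form fields; only the uploaded file is counted
                if (item.isFormField()) {
                    continue;
                }
                // try-with-resources closes the upload stream after counting
                try (InputStream input = item.openStream()) {
                    WordCountFreq wcf = new WordCountFreq();
                    ArrayList<Word> list = (ArrayList<Word>) wcf.sortAndOutput(input);
                    // show.jsp reads the result from this request attribute
                    request.setAttribute("list", list);
                }
            }
        } catch (FileUploadException e) {
            e.printStackTrace();
        }
        System.out.println("Success!");
        response.setContentType("text/html;charset=utf-8");
        request.getRequestDispatcher("/show.jsp").forward(request, response);
    }
}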

The key method of the counting process, sortAndOutput(), is shown below:

public List<Word> sortAndOutput(InputStream input) throws IOException {
        BufferedInputStream bis = new BufferedInputStream(input);
        byte[] buf = new byte[1024];
        int len;
        // Partial word cut off at the end of the previous chunk
        String lastWord = "";
        while ((len = bis.read(buf)) != -1) {
            // Convert the bytes just read into a string (platform default charset)
            // and prepend the fragment carried over from the previous chunk
            String temp = lastWord + new String(buf, 0, len);
            // Find where the trailing run of letters starts. The j > 0 guard keeps
            // the scan in bounds when the whole chunk consists of letters.
            int j = temp.length();
            while (j > 0 && Character.isLetter(temp.charAt(j - 1))) {
                j--;
            }
            // Hold back the trailing letters: the word may continue in the next chunk
            lastWord = temp.substring(j);
            temp = temp.substring(0, j);
            root = generateCharTree(temp);
        }
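The loop above holds back any word that is cut off at the end of a chunk and prepends it to the next chunk. Here is a small self-contained sketch of that carry-over idea, written under a few assumptions: `ChunkedWordSplitter` and `splitWords` are illustrative names, the chunk size is a parameter so that boundaries are easy to hit in a demo, the text is assumed to be ASCII so a chunk never ends in the middle of a character, and the leftover fragment is flushed once the stream ends.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ChunkedWordSplitter {

    // Reads the stream in fixed-size chunks and returns the words in order.
    static List<String> splitWords(InputStream in, int chunkSize) throws IOException {
        List<String> words = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        String carry = "";  // partial word left over from the previous chunk
        int len;
        while ((len = in.read(buf)) != -1) {
            String text = carry + new String(buf, 0, len, StandardCharsets.US_ASCII);
            // Walk back over the trailing letters, staying within bounds
            int cut = text.length();
            while (cut > 0 && Character.isLetter(text.charAt(cut - 1))) {
                cut--;
            }
            carry = text.substring(cut);          // may continue in the next chunk
            addWords(text.substring(0, cut), words);
        }
        addWords(carry, words);  // flush the last word once the stream ends
        return words;
    }

    // Appends every run of letters in text to out.
    private static void addWords(String text, List<String> out) {
        for (String token : text.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "hello world again".getBytes(StandardCharsets.US_ASCII));
        // A 4-byte chunk size forces every word to straddle a boundary
        System.out.println(splitWords(in, 4));
    }
}
```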

An example run:

[Screenshot: Word Frequency Count, web version]

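Before the web version, the program simply took a file path as input. The one small difficulty in moving to the web version was that the upload arrives as a byte stream and has to be converted to characters before processing; after a bit of searching this was solved quickly.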
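For the byte-to-character conversion mentioned above, one common option in Java is to wrap the stream in an InputStreamReader with an explicit charset, which decodes multi-byte characters correctly even when they straddle a read boundary. This is a sketch of that alternative rather than what the project does: UTF-8 is an assumption about the uploaded file, and `StreamText.readAll` is a hypothetical helper.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class StreamText {

    // Decodes the whole stream as UTF-8 text and closes it afterwards.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int n;
            // The reader hands back whole characters, never half of one
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "word 词频 count".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAll(new ByteArrayInputStream(bytes)));
    }
}
```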

ssh:git@git.coding.net:muziliquan/GUIVersion.git

git:git://git.coding.net/muziliquan/GUIVersion.git
