Write your own tools for finding files, searching strings, and locating jar files

File search: find files under the current directory

If you know roughly what the file is called, run:

findf FileName

findf.py

import argparse, os, re
from os.path import join

parser = argparse.ArgumentParser()
parser.add_argument('FILENAME', help='regular expression the file name must match')
parser.add_argument('-e', metavar='EXCLUDE', default=None, help='regular expression for file names to exclude')
args = parser.parse_args()

count = 0
# Walk the current directory and all of its subdirectories.
for root, dirs, files in os.walk(os.getcwd()):
    for name in files:
        if not re.search(args.FILENAME, name):
            continue
        # Skip names that match the optional exclude pattern.
        if args.e and re.search(args.e, name):
            continue
        count += 1
        print(join(root, name))

# Use the singular form only for exactly one match.
print('\t %d file%s found.' % (count, '' if count == 1 else 's'))

The equivalent command on Windows:

dir /b /s FileName

On Linux:

find . -name FileName -print
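The core loop of findf.py can also be packaged as a function, which makes it easier to reuse and to test. What follows is only a sketch: the name find_files and its parameters are made up for illustration and are not part of the original script. Like findf.py it matches with regular expressions, whereas dir and find -name take wildcard patterns.

```python
import os
import re
import tempfile


def find_files(root, pattern, exclude=None):
    """Return sorted paths under root whose file name matches pattern but not exclude."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not re.search(pattern, name):
                continue
            if exclude and re.search(exclude, name):
                continue
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Build a throwaway tree so the demo does not depend on the current directory.
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "src"))
        for name in ("Main.java", "MainTest.java", "notes.txt"):
            open(os.path.join(tmp, "src", name), "w").close()
        # Java files, minus anything with "Test" in the name.
        for path in find_files(tmp, r"\.java$", exclude=r"Test"):
            print(path)
```

Returning a list instead of printing directly leaves it to the caller to decide whether to print, count, or filter the results further.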

String search: search file contents in the current directory and its subdirectories

For example, to see which Java files contain the string "Hello":

finds .java Hello

finds.py

import argparse, os, re
from os.path import join

parser = argparse.ArgumentParser()
parser.add_argument('-c', metavar='CODING', default='utf-8', help='file encoding, e.g. gbk or utf-8')
parser.add_argument('EXT', help='file extension, e.g. .java or .txt')
parser.add_argument('REGEX', help='regular expression to search for')
args = parser.parse_args()

for root, dirs, files in os.walk(os.getcwd()):
    for name in files:
        if not name.endswith(args.EXT):
            continue
        inFile = False
        # Close the file when done; number lines from 1.
        with open(join(root, name), encoding=args.c) as f:
            for count, line in enumerate(f, start=1):
                if re.search(args.REGEX, line):
                    print(count, '\t', line.rstrip('\n'))
                    inFile = True
        # Print the file name after its matching lines.
        if inFile:
            print(join(root, name))

The equivalent command on Windows:

findstr /s /c:"Hello" *.java

On Linux:

find . -name '*.java' -exec grep 'Hello' \{} \; -print
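One weakness of finds.py is that a single file in the wrong encoding stops the whole scan with a decoding error. A possible variation, sketched here under the assumption that replacing undecodable bytes is acceptable, opens each file with errors="replace" and yields the matches rather than printing them. The name grep_tree and its signature are hypothetical and not taken from the original script.

```python
import os
import re
import tempfile


def grep_tree(root, ext, regex, encoding="utf-8"):
    """Yield (path, line_number, line) for every matching line in files ending with ext."""
    pattern = re.compile(regex)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(ext):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going when a file is not valid
            # in the chosen encoding; undecodable bytes become U+FFFD.
            with open(path, encoding=encoding, errors="replace") as f:
                for lineno, line in enumerate(f, start=1):
                    if pattern.search(line):
                        yield path, lineno, line.rstrip("\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "Hello.java"), "w", encoding="utf-8") as f:
            f.write('class Hello {\n    String s = "Hello";\n}\n')
        for path, lineno, line in grep_tree(tmp, ".java", "Hello"):
            print(lineno, "\t", line)
```

Compiling the pattern once with re.compile also avoids looking it up again for every line, which may matter on large source trees.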

Jar search: find the jar that contains a class

Sometimes you need to use a class but do not know which jar it lives in.

findj ClassName

findj.py

import os, re, sys
from os.path import join
from zipfile import ZipFile

def find_in_all_jars(matchstr, path):
    # Print every jar under path that has an entry matching matchstr.
    for root, dirs, files in os.walk(path):
        for name in files:
            if name.endswith('.jar') and handle_jar(matchstr, join(root, name)):
                print(join(root, name))

def handle_jar(matchstr, filename):
    # A jar is a zip archive, so list its entries and match each name.
    isMatch = False
    with ZipFile(filename) as zipfile:
        for name in zipfile.namelist():
            if re.search(matchstr, name):
                print('\t', name)
                isMatch = True
    return isMatch

if len(sys.argv) != 2:
    print('Usage: findj ClassName.class')
    sys.exit(1)

# find_in_all_jars(sys.argv[1], os.getcwd())  # search every jar in the current directory and its subdirectories
# Search a fixed local repository instead; change this path to your own.
find_in_all_jars(sys.argv[1], 'e:\\data\\repository')

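The equivalent Windows command (with &&, the jar name is printed only when a match was found in it):

for /r %i in (*.jar) do @jar tf %i | find "FastDateFormat" && echo %i

With a single &, every jar name is printed, whether or not it matched:

for /r %i in (*.jar) do @jar tf %i | find "FastDateFormat" & echo %i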

On Linux (find's -exec does not go through a shell, so the pipe has to run inside sh -c):

find . -name '*.jar' -exec sh -c 'jar tvf "$1" | grep Hello' sh {} \; -print
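Because a jar is an ordinary zip archive, the lookup in findj.py is easy to try without a real repository. The sketch below builds a stand-in jar in a temporary directory and searches it. The function name jars_containing and the entry names are invented for the demo and do not come from the original script.

```python
import os
import re
import tempfile
from zipfile import ZipFile


def jars_containing(root, regex):
    """Map each jar path under root to the entry names that match regex."""
    pattern = re.compile(regex)
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".jar"):
                continue
            path = os.path.join(dirpath, name)
            with ZipFile(path) as jar:
                entries = [e for e in jar.namelist() if pattern.search(e)]
            if entries:
                hits[path] = entries
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # ZipFile can create a stand-in jar; the class file is left empty.
        with ZipFile(os.path.join(tmp, "demo.jar"), "w") as jar:
            jar.writestr("com/example/Hello.class", b"")
            jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        for path, entries in jars_containing(tmp, r"Hello\.class$").items():
            print(path)
            for entry in entries:
                print("\t", entry)
```

Entry names inside a jar use forward slashes, so a pattern such as com/example/Hello matches on any platform. A corrupt jar would make ZipFile raise zipfile.BadZipFile, which this sketch does not handle.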

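Word count: measure the code under the current directory

Count the characters, words, and lines of the matching files under the current directory.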

pwc .java

pwc.py

import argparse, os
from os.path import join

parser = argparse.ArgumentParser()
parser.add_argument('-c', metavar='CODING', default='utf-8', help='file encoding, e.g. gbk or utf-8')
parser.add_argument('-d', action='store_true', default=False, help='print per-file detail')
parser.add_argument('EXT', help='file extension, e.g. .java or .txt')
args = parser.parse_args()

# Totals: files, lines, words, characters.
nf = nl = nw = nc = 0
for root, dirs, files in os.walk(os.getcwd()):
    for name in files:
        if not name.endswith(args.EXT):
            continue
        nf += 1
        # Per-file counts.
        nnl = nnc = nnw = 0
        try:
            with open(join(root, name), encoding=args.c) as f:
                for line in f.readlines():
                    nnl += 1
                    nnc += len(line)
                    nnw += len(line.split())
        except (OSError, UnicodeError):
            # Unreadable or wrongly encoded file: report it and count it as empty.
            print('Error parsing file: ' + join(root, name))
        nl += nnl
        nc += nnc
        nw += nnw
        if args.d:
            print("%12d%10d%8d  %s" % (nnc, nnw, nnl, join(root, name)))

print("%12d%10d%8d%10d Files" % (nc, nw, nl, nf))
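To make the three numbers concrete, here is a small sketch of the same counting rules applied to a single string. The helper count_text is hypothetical and only mirrors what pwc.py does per file: characters are counted after decoding (so they are not bytes), words are whitespace-separated tokens, and a final line without a trailing newline still counts as a line.

```python
def count_text(text):
    """Return (characters, words, lines) using the same rules as pwc.py."""
    chars = len(text)
    words = len(text.split())
    # One line per newline, plus one for a last line that lacks a newline.
    lines = text.count("\n")
    if text and not text.endswith("\n"):
        lines += 1
    return chars, words, lines


if __name__ == "__main__":
    sample = "public class A {\n}\n"
    print("%12d%10d%8d" % count_text(sample))  # counts are 19, 5, 2
```

Since the script reads files in text mode, a Windows line ending is read as a single newline character, so the character total can be smaller than the file size in bytes.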