Is there a quick-starting Haskell interpreter suitable for scripting?

Time: 2021-12-22 17:19:44

Does anyone know of a quick-starting Haskell interpreter that would be suitable for use in writing shell scripts? Running 'hello world' using Hugs took 400ms on my old laptop and takes 300ms on my current Thinkpad X300. That's too slow for instantaneous response. Times with GHCi are similar.

Functional languages don't have to be slow: both Objective Caml and Moscow ML run hello world in 1ms or less.

Clarification: I am a heavy user of GHC and I know how to use GHCi. I know all about compiling to get things fast. Parsing costs should be completely irrelevant: if ML and OCaml can start 300x faster than GHCi, then there is room for improvement.

I am looking for

  • The convenience of scripting: one source file, no binary code, same code runs on all platforms

  • Performance comparable to other interpreters, including fast startup and execution for a simple program like

    module Main where
    main = print 33
    

I am not looking for compiled performance for more serious programs. The whole point is to see if Haskell can be useful for scripting.

5 Solutions

#1


Using ghc -e is pretty much equivalent to invoking ghci. I believe that GHC's runhaskell compiles the code to a temporary executable before running it, as opposed to interpreting it like ghc -e/ghci, but I'm not 100% certain.

$ time echo 'Hello, world!'
Hello, world!

real    0m0.021s
user    0m0.000s
sys     0m0.000s
$ time ghc -e 'putStrLn "Hello, world!"'
Hello, world!

real    0m0.401s
user    0m0.031s
sys     0m0.015s
$ echo 'main = putStrLn "Hello, world!"' > hw.hs
$ time runhaskell hw.hs
Hello, world!

real    0m0.335s
user    0m0.015s
sys     0m0.015s
$ time ghc --make hw
[1 of 1] Compiling Main             ( hw.hs, hw.o )
Linking hw ...

real    0m0.855s
user    0m0.015s
sys     0m0.015s
$ time ./hw
Hello, world!

real    0m0.037s
user    0m0.015s
sys     0m0.000s
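
To repeat measurements like the ones above from Haskell itself, a minimal sketch could look like the following. The name `timeCommand` is hypothetical; it assumes the `process` package that ships with GHC and a `base` recent enough to provide `GHC.Clock`. It reports wall-clock time only, which corresponds to the `real` figure printed by `time`.

```haskell
import GHC.Clock (getMonotonicTime)
import System.Exit (ExitCode (..))
import System.Process (readProcessWithExitCode)

-- | Run a command to completion and return its exit code together with
-- the elapsed wall-clock time in seconds. The command's output is
-- captured and discarded rather than echoed to the terminal.
timeCommand :: FilePath -> [String] -> IO (ExitCode, Double)
timeCommand cmd args = do
  start <- getMonotonicTime
  (code, _out, _err) <- readProcessWithExitCode cmd args ""
  end <- getMonotonicTime
  return (code, end - start)
```

In GHCi, `timeCommand "runhaskell" ["hw.hs"]` would time the interpreted run and `timeCommand "./hw" []` the compiled binary.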

How hard is it to simply compile all your "scripts" before running them?

Edit

Ah, providing binaries for multiple architectures is a pain indeed. I've gone down that road before, and it's not much fun...

Sadly, I don't think it's possible to make any Haskell compiler's startup overhead any better. The language's declarative nature means that it's necessary to read the entire program before even trying to typecheck anything, never mind executing it, and then you either pay the cost of strictness analysis or suffer unnecessary laziness and thunking.

The popular 'scripting' languages (shell, Perl, Python, etc.) and the ML-based languages require only a single pass... well okay, ML requires a static typecheck pass and Perl has this amusing 5-pass scheme (with two of them running in reverse); either way, being procedural means that the compiler/interpreter has a much easier job assembling the bits of the program together.

In short, I don't think it's possible to get much better than this. I haven't tested to see if Hugs or GHCi has a faster startup, but any difference there is still faaar away from non-Haskell languages.

#2


Why not create a script front-end that compiles the script if it hasn't been compiled before or if the compiled version is out of date?

Here's the basic idea. This code could be improved a lot: search the path rather than assuming everything's in the same directory, handle other file extensions better, etc. Also, I'm pretty green at Haskell coding (ghc-compiled-script.hs):

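import Control.Monad (when)
import System.Directory (doesFileExist)
import System.Environment (getArgs)
import System.Exit (ExitCode (..), exitFailure)
import System.FilePath (dropExtension)
import System.IO (hPutStrLn, stderr)
import System.Posix.Files (getFileStatus, modificationTime)
import System.Posix.Process (executeFile)
import System.Posix.Types (EpochTime)
import System.Process (system)

-- Last-modification time of a file.
getMTime :: FilePath -> IO EpochTime
getMTime f = modificationTime <$> getFileStatus f

main :: IO ()
main = do
  scr : args <- getArgs
  -- The compiled binary sits next to the script, minus its extension.
  -- dropExtension also copes with paths such as "./hs-echo.hs".
  let cscr = dropExtension scr

  scrExists <- doesFileExist scr
  cscrExists <- doesFileExist cscr
  -- Recompile when the binary is missing or not newer than the script.
  compile <- if scrExists && cscrExists
               then do
                 scrMTime <- getMTime scr
                 cscrMTime <- getMTime cscr
                 return $ cscrMTime <= scrMTime
               else
                 return True

  when compile $ do
    r <- system $ "ghc --make " ++ scr
    case r of
      ExitFailure i -> do
        hPutStrLn stderr $
          "'ghc --make " ++ scr ++ "' failed: " ++ show i
        exitFailure
      ExitSuccess -> return ()

  -- Replace this process with the compiled script (no PATH search).
  executeFile cscr False args Nothing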
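
The freshness check in the front-end above can also be written as a small pure function, which makes the rule easy to test on its own. This is only a sketch: `needsRebuild`, `mtimeOf`, and `isStale` are hypothetical names, and it uses `getModificationTime` from the `directory` package instead of the POSIX call, so the same logic would also work outside Unix.

```haskell
import Control.Exception (IOException, try)
import Data.Time.Clock (UTCTime)
import System.Directory (getModificationTime)

-- | Pure decision: rebuild when there is no binary, or when the binary
-- is not strictly newer than the source.
needsRebuild :: Ord t => t -> Maybe t -> Bool
needsRebuild _ Nothing = True
needsRebuild srcTime (Just binTime) = binTime <= srcTime

-- | Modification time of a file, or Nothing if it cannot be read
-- (for example because the file does not exist).
mtimeOf :: FilePath -> IO (Maybe UTCTime)
mtimeOf path = do
  r <- try (getModificationTime path) :: IO (Either IOException UTCTime)
  return (either (const Nothing) Just r)

-- | True when the script at src should be recompiled into bin.
-- A missing source also yields True, as in the front-end above,
-- so that the compiler gets to report the error.
isStale :: FilePath -> FilePath -> IO Bool
isStale src bin = do
  srcTime <- mtimeOf src
  binTime <- mtimeOf bin
  return $ case srcTime of
    Nothing -> True
    Just t -> needsRebuild t binTime
```

Keeping the comparison pure means the policy (for instance, switching `<=` to `<`) can be changed and checked without touching the file system.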

Now we can create scripts such as this (hs-echo.hs):

#! ghc-compiled-script

import System.Environment (getArgs)

-- Echo the command-line arguments, separated by single spaces.
main :: IO ()
main = do
  args <- getArgs
  putStrLn (unwords args)

And now running it:

$ time hs-echo.hs "Hello, world\!"     
[1 of 1] Compiling Main             ( hs-echo.hs, hs-echo.o )
Linking hs-echo ...
Hello, world!
hs-echo.hs "Hello, world!"  0.83s user 0.21s system 97% cpu 1.062 total

$ time hs-echo.hs "Hello, world, again\!"
Hello, world, again!
hs-echo.hs "Hello, world, again!"  0.01s user 0.00s system 60% cpu 0.022 total

#3


If you are really concerned with speed, you are going to be hampered by re-parsing the code on every launch. Haskell doesn't need to be run from an interpreter; compile it with GHC and you should get excellent performance.

#4


You have two parts to this question:

  • you care about performance

  • you want scripting

If you care about performance, the only serious option is GHC, which is very very fast: http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all

If you want something light for Unix scripting, I'd use GHCi. It is about 30x faster than Hugs, and it also supports all the libraries on Hackage.

So install GHC now (and get GHCi for free).

#5


What about having a ghci daemon and a feeder script that takes the script path and location, communicates with the already running ghci process to load and execute the script in the proper directory and pipes the output back to the feeder script for stdout?

Unfortunately, I have no idea how to write something like this, but judging by the speed of :l in ghci it seems like it could be really fast. It seems most of the cost in runhaskell is in starting up ghci, not in parsing and running the script.
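
As a rough illustration of the idea (not a tested daemon), one could keep a single `ghci` process alive and feed it commands over a pipe. Everything named here is hypothetical: `commandsFor`, `startGhci`, `runScript`, and the `sentinel` marker are made up for the sketch. It assumes `ghci` is on the PATH, that paths contain no spaces, and that the script's output ends with a newline so the sentinel lands on a line of its own.

```haskell
import System.IO (BufferMode (LineBuffering), Handle, hGetLine, hPutStrLn, hSetBuffering)
import System.Process (CreateProcess (..), StdStream (CreatePipe), createProcess, proc)

-- | Hypothetical marker printed after each script so the caller knows
-- where that script's output ends.
sentinel :: String
sentinel = "<<script-finished>>"

-- | The GHCi commands that run one script: change directory, load the
-- file, call main with the given arguments, then print the sentinel.
commandsFor :: FilePath -> FilePath -> [String] -> [String]
commandsFor dir script args =
  [ ":cd " ++ dir
  , ":load " ++ script
  , ":main " ++ show args
  , "Prelude.putStrLn " ++ show sentinel
  ]

-- | Start one long-lived ghci and return its (stdin, stdout) handles.
startGhci :: IO (Handle, Handle)
startGhci = do
  (Just hin, Just hout, _, _) <-
    createProcess
      (proc "ghci" ["-v0", "-ignore-dot-ghci"])
        { std_in = CreatePipe, std_out = CreatePipe }
  hSetBuffering hin LineBuffering
  return (hin, hout)

-- | Run a script in the already-running ghci and collect the lines it
-- prints on stdout, up to (but not including) the sentinel.
runScript :: (Handle, Handle) -> FilePath -> FilePath -> [String] -> IO [String]
runScript (hin, hout) dir script args = do
  mapM_ (hPutStrLn hin) (commandsFor dir script args)
  collect
  where
    collect = do
      line <- hGetLine hout
      if line == sentinel then return [] else (line :) <$> collect
```

The startup cost is paid once in `startGhci`; each later `runScript` call only pays for `:load` and the run itself. Standard error and the script's stdin are not handled in this sketch.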

Edit: After some playing around, I found the Hint package (a wrapper around the GHC API) to be a perfect fit here. The following code loads the passed-in module name (here assumed to be in the same directory) and executes its main function. Now 'all' that's left is to make it a daemon, and have it accept scripts on a pipe or socket.

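import Control.Monad.IO.Class (liftIO)
import Language.Haskell.Interpreter

-- Load a module by name and run its main; a Left carries the interpreter error.
run :: String -> IO (Either InterpreterError ())
run = runInterpreter . test

test :: String -> Interpreter ()
test mname = do
  loadModules [mname ++ ".hs"]
  setTopLevelModules [mname]
  -- Interpret "main" as an IO action, then execute it.
  res <- interpret "main" (as :: IO ())
  liftIO res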
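
The remaining daemon step might be sketched as a loop that reads one module name per line and hands each to a runner. The names `serveRequests` and `serveFifo` are hypothetical, and the runner is passed in as a parameter so the sketch does not depend on Hint; with the code above it could be something like `\name -> run name >>= either print return`.

```haskell
import Control.Exception (SomeException, try)
import System.IO (Handle, IOMode (ReadMode), hGetLine, hIsEOF, hPutStrLn, stderr, withFile)

-- | Read requests (one module name per line) until end of input and
-- hand each to the supplied runner. A failing request is reported on
-- stderr and does not stop the loop.
serveRequests :: (String -> IO ()) -> Handle -> IO ()
serveRequests runner h = loop
  where
    loop = do
      eof <- hIsEOF h
      if eof
        then return ()
        else do
          name <- hGetLine h
          result <- try (runner name) :: IO (Either SomeException ())
          case result of
            Left err ->
              hPutStrLn stderr ("request " ++ show name ++ " failed: " ++ show err)
            Right () -> return ()
          loop

-- | Serve every request found at the given path, which is assumed to
-- be a named pipe (an ordinary file works as well).
serveFifo :: (String -> IO ()) -> FilePath -> IO ()
serveFifo runner path = withFile path ReadMode (serveRequests runner)
```

With a real named pipe, end of input arrives once every writer has closed it, so a long-running daemon would call `serveFifo` again in an outer loop.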

Edit2: As far as stdout/err/in go, using this specific GHC trick it looks like it would be possible to redirect the standard streams of the client program into the feeder program, then into some named pipes (perhaps) that the daemon is connected to at all times, and then send the daemon's stdout back through another named pipe that the feeder program is listening on. Pseudo-example:

grep ... | feeder my_script.hs | xargs ...
            |   ^---------------- <
            V                      |
         named pipe -> daemon -> named pipe

Here the feeder would be a small compiled harness program that just redirects the standard streams into and then back out of the daemon, and gives the name and location of the script to the daemon.
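
How the feeder tells the daemon the name and location of the script is left open above. One possible encoding, purely as an assumption, is a one-line textual request built with the derived `Show` and `Read` instances; the `Request` type and its field names are hypothetical.

```haskell
import Text.Read (readMaybe)

-- | What the feeder would send to the daemon: where to run, which
-- script to load, and the command-line arguments to pass along.
data Request = Request
  { reqDir :: FilePath
  , reqScript :: FilePath
  , reqArgs :: [String]
  }
  deriving (Eq, Show, Read)

-- | Encode a request as a single line. Derived Show escapes newlines
-- inside strings, so the result never spans more than one line.
encodeRequest :: Request -> String
encodeRequest = show

-- | Decode a request line; Nothing if the line is malformed.
decodeRequest :: String -> Maybe Request
decodeRequest = readMaybe
```

The feeder would write `encodeRequest req` followed by a newline to the request pipe, and the daemon would call `decodeRequest` on each line it reads. A compact binary format would be faster to parse, but the textual form is easy to inspect while debugging.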
