
Time: 2022-04-30 23:07:18

I'm studying how some programming languages assign memory to structured data (in this case I'm studying arrays).


I'm creating the array as shown here, in section 3:


import Data.Array.IO
arr <- newArray (1,10) 37 :: IO (IOArray Int Int) --Sets default to 37

And what I'm trying to do is print each element's memory address, something like this:


Array Start: <dec addr> | <hex addr> --Shows where the array itself is
Array 1: <dec addr> | <hex addr> --Memory address of the first element
Array 2: <dec addr> | <hex addr> --Memory address of the second element

The problem that I have is that I don't know how to get the memory address value for an element in Haskell.


Is there a function similar to Python's id(object) or Ruby's object.object_id?


1 Solution



You can use the following snippet which I borrowed from the ghc-heap-view package (it also contains an alternative solution using foreign import prim):


{-# LANGUAGE MagicHash, BangPatterns #-}

import Data.Array.IO (IOArray, newListArray, readArray) -- used by the main example
import GHC.Exts

-- A datatype that has the same layout as Word, so it can be cast to Word.
data Ptr' a = Ptr' a

-- Any is a type to which a value of any type can safely be unsafeCoerced.
aToWord# :: Any -> Word#
aToWord# a = let !mb = Ptr' a in case unsafeCoerce# mb :: Word of W# addr -> addr

unsafeAddr :: a -> Int
unsafeAddr a = I# (word2Int# (aToWord# (unsafeCoerce# a)))

This works by first wrapping a in a Ptr' constructor and then casting Ptr' a to Word. Since the field of Ptr' is represented as a pointer to a, the resulting word contains the address of the object. The usual caveats apply: this is unsafe and GHC-specific, it breaks referential transparency, and the garbage collector may move the object, so the address can change over time.




main :: IO ()
main = do
  arr <- newListArray (1,10) [1,2..] :: IO (IOArray Int Int)
  a1  <- readArray arr 1
  a2  <- readArray arr 2
  a1' <- readArray arr 1

  putStrLn $ "a1 : " ++ (show . unsafeAddr $! a1)
  putStrLn $ "a1 : " ++ (show . unsafeAddr $! a1)
  putStrLn $ "a2 : " ++ (show . unsafeAddr $! a2)
  putStrLn $ "a2 : " ++ (show . unsafeAddr $! a2)
  putStrLn $ "a1': " ++ (show . unsafeAddr $! a1')



a1 : 16785657
a1 : 16785657
a2 : 16785709
a2 : 16785709
a1': 16785657
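
To get the "<dec addr> | <hex addr>" layout the question asks for, the Int returned by unsafeAddr can be rendered with showHex from Numeric. The following is only a sketch: formatAddr is a hypothetical helper name, and the numbers fed to it are copied from the sample output above rather than measured. Keep in mind that an IOArray holds pointers to boxed values, so these are addresses of the Int objects read from the array, not of the array slots themselves.

```haskell
import Numeric (showHex)

-- Hypothetical helper: render an address as "<dec addr> | <hex addr>".
-- Assumes a non-negative address; showHex raises an error on negative input.
formatAddr :: Int -> String
formatAddr addr = show addr ++ " | 0x" ++ showHex addr ""

main :: IO ()
main = do
  -- Sample values taken from the output above, not freshly measured.
  putStrLn $ "Array 1: " ++ formatAddr 16785657
  putStrLn $ "Array 2: " ++ formatAddr 16785709
```

In a real program you would pass the result of unsafeAddr $! a1 to formatAddr instead of a literal.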

Note that you should apply unsafeAddr with $!. Otherwise you'll get the address of the thunk that will evaluate to a, rather than the address of the evaluated a object itself:


  let a = 1
      b = 2
      c = a + b

  putStrLn $ "c: " ++ (show . unsafeAddr $ c)
  putStrLn $ "c: " ++ (show . unsafeAddr $! c)
  putStrLn $ "c: " ++ (show . unsafeAddr $! c)



c: 9465024
c: 9467001
c: 9467001
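
One detail that helps when reading these numbers: GHC tags pointers. A pointer to an evaluated constructor carries a small tag in its low bits (two bits on 32-bit platforms, three on 64-bit), while a pointer to a thunk is untagged. That would explain why the evaluated addresses above are odd. The sketch below separates the tag from the address; splitTagged is a hypothetical helper, and the choice of two tag bits assumes the sample output came from a 32-bit build, which the answer does not state.

```haskell
import Data.Bits (complement, shiftL, (.&.))

-- Hypothetical helper: split a raw word into (untagged address, tag),
-- given how many low bits hold the tag.
splitTagged :: Int -> Int -> (Int, Int)
splitTagged tagBits raw = (raw .&. complement mask, raw .&. mask)
  where
    mask = (1 `shiftL` tagBits) - 1

main :: IO ()
main = do
  -- Assumption: 2 tag bits (32-bit build). Use 3 on a 64-bit build.
  print (splitTagged 2 9465024) -- c before evaluation: tag 0
  print (splitTagged 2 9467001) -- c after evaluation: tag 1
```

If you want the address of the heap object itself, mask off the tag bits before printing.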


