I've been experimenting with using weak pointers to remove unused versions from the history chain. If a sequence versions somewhere in the middle of the chain is unused, their diffs could be combined into an IntMap (see also #13) which has logarithmic time lookup, so this could significantly improve the time it takes to read from old versions of the array.
I tried to start with a simple experiment: implement a list whose elements disappear when they are no longer directly referenced. I think that nicely models the behaviour we want. Unfortunately, it does not work yet and I don't have any more time this weekend. Here's my attempt:
{-# LANGUAGE LambdaCase #-}
import System.Mem.Weak
import System.Mem
import System.IO
import Control.Concurrent
import Data.IORef
import Control.Monad
import Data.Coerce
newtype WeakList a = WL (IORef (WeakListData a))
data WeakListData a = Nil | Cons a !(Weak (WeakList a))
nil :: IO (WeakList a)
nil = WL <$> newIORef Nil
cons :: Show a => a -> WeakList a -> IO (WeakList a)
cons x xs@(WL ref) = do
ref' <- newIORef Nil
w <- readIORef ref >>= \case
Nil -> mkWeakIORef ref (pure ())
Cons y ys -> mkWeakIORef ref $ print y >> writeIORef ref' (Cons x ys)
writeIORef ref' (Cons x (coerce w))
pure (WL ref')
toList :: WeakList a -> IO [a]
toList (WL ref) = readIORef ref >>= \case
Nil -> pure []
Cons x w -> deRefWeak w >>= \case
Nothing -> pure [x]
Just xs -> (x :) <$> toList xs
main = do
hSetBuffering stdout LineBuffering
xs <- cons 3 =<< cons 2 =<< cons (1 :: Int) =<< nil
print =<< toList xs
ys <- cons 6 =<< cons 5 =<< cons 4 xs
print =<< toList ys
performGC
threadDelay 100000
print =<< toList xs
print =<< toList ys
It currently prints:
[3,2,1]
[6,5,4,3,2,1]
2
1
[3]
[6,5,4,3]
I wanted it to print:
[3,2,1]
[6,5,4,3,2,1]
5
4
2
1
[3]
[6,3]
Of course this probably would have way too much overhead, but this could show that it is possible to get the GC to help us in theory.
I've been experimenting with using weak pointers to remove unused versions from the history chain. If a sequence versions somewhere in the middle of the chain is unused, their diffs could be combined into an
IntMap(see also #13) which has logarithmic time lookup, so this could significantly improve the time it takes to read from old versions of the array.I tried to start with a simple experiment: implement a list whose elements disappear when they are no longer directly referenced. I think that nicely models the behaviour we want. Unfortunately, it does not work yet and I don't have any more time this weekend. Here's my attempt:
{-# LANGUAGE LambdaCase #-} import System.Mem.Weak import System.Mem import System.IO import Control.Concurrent import Data.IORef import Control.Monad import Data.Coerce newtype WeakList a = WL (IORef (WeakListData a)) data WeakListData a = Nil | Cons a !(Weak (WeakList a)) nil :: IO (WeakList a) nil = WL <$> newIORef Nil cons :: Show a => a -> WeakList a -> IO (WeakList a) cons x xs@(WL ref) = do ref' <- newIORef Nil w <- readIORef ref >>= \case Nil -> mkWeakIORef ref (pure ()) Cons y ys -> mkWeakIORef ref $ print y >> writeIORef ref' (Cons x ys) writeIORef ref' (Cons x (coerce w)) pure (WL ref') toList :: WeakList a -> IO [a] toList (WL ref) = readIORef ref >>= \case Nil -> pure [] Cons x w -> deRefWeak w >>= \case Nothing -> pure [x] Just xs -> (x :) <$> toList xs main = do hSetBuffering stdout LineBuffering xs <- cons 3 =<< cons 2 =<< cons (1 :: Int) =<< nil print =<< toList xs ys <- cons 6 =<< cons 5 =<< cons 4 xs print =<< toList ys performGC threadDelay 100000 print =<< toList xs print =<< toList ysIt currently prints:
I wanted it to print:
Of course this probably would have way too much overhead, but this could show that it is possible to get the GC to help us in theory.