Probably won't be doing any big coding on the git-annex assistant in the upcoming week, as I'll be traveling and/or just ill enough that I can't fully get into flow.


There was a new Yesod release this week, which required minor changes to make the webapp build with it. I managed to keep the old version of Yesod also supported, and plan to keep that working so it can be built with the version of Yesod available in, e.g., Linux distributions. TBD how much pain that will involve going forward.


I'm mulling over how to support stopping/pausing transfers. The problem is that if the assistant is running a transfer in one thread, and the webapp is used to cancel it, killing that thread won't necessarily stop the transfer, because, at least in Haskell's thread model, killing a thread does not kill processes started by the thread (like rsync).
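To make the problem concrete: a child process only goes away with its thread if the code that started it cleans up on the way out. Here is a minimal sketch of that, not git-annex code. `runKillable` and `runWithTimeout` are hypothetical helpers, and the exit code is polled rather than waited on so the thread stays easy to interrupt.

```haskell
import Control.Concurrent (threadDelay)
import Control.Exception (bracket)
import System.Exit (ExitCode(..))
import System.Process
  (createProcess, getProcessExitCode, proc, terminateProcess, waitForProcess)
import System.Timeout (timeout)

-- | Run a command so that an asynchronous exception in the calling
-- thread (such as the one killThread delivers) also stops the child.
-- Hypothetical helper, for illustration only.
runKillable :: FilePath -> [String] -> IO ExitCode
runKillable cmd args = bracket start stop poll
  where
    start = do
      (_, _, _, ph) <- createProcess (proc cmd args)
      return ph
    -- Runs on normal exit and on exceptions. terminateProcess does
    -- nothing once the exit code has been collected. Assumes the child
    -- exits on SIGTERM; otherwise the wait would block.
    stop ph = terminateProcess ph >> waitForProcess ph
    -- Poll instead of blocking in waitForProcess.
    poll ph = do
      status <- getProcessExitCode ph
      case status of
        Just code -> return code
        Nothing -> threadDelay 100000 >> poll ph

-- | Example use: timeout interrupts the thread with an asynchronous
-- exception, much as killThread would, and the child is terminated
-- before Nothing is returned.
runWithTimeout :: Int -> FilePath -> [String] -> IO (Maybe ExitCode)
runWithTimeout micros cmd args = timeout micros (runKillable cmd args)
```

Without the `bracket`, interrupting the thread would leave the child running, which is exactly the situation described above.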

So one option is to have the transfer thread run a separate git-annex process, which will run the actual transfer. And killing that process will stop the transfer nicely. However, using a separate git-annex process means a little startup overhead for each file transferred (I don't know if it'd be enough to matter). Also, there's the problem that git-annex is sometimes not installed in PATH (wish I understood why cabal does that), which makes it kind of hard for it to run itself. (It can't simply fork, sadly. See past horrible pain with forking and threads.)

The other option is to change the API for git-annex remotes, so that their storeKey and retrieveKeyFile methods return a pid of the program that they run. (When they do run a program, that is; not all remotes do.) This seems like it'd make the code in the remotes hairier, and it is also asking for bugs, when a remote's implementation changes. Or I could go lower-level, and make every place in the utility libraries that forks a process record its pid in a per-thread MVar. Still seems to be asking for bugs.
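The lower-level variant might look roughly like this. All the names (`ProcTable`, `startRecorded`, `cancelThread`, `demo`) are hypothetical, and the sketch uses one shared MVar keyed by `ThreadId` rather than a true per-thread MVar, to keep it short. It also shows where bugs can creep in: a thread that starts another process between the terminate and the kill would leak it.

```haskell
import Control.Concurrent (ThreadId, forkIO, killThread, myThreadId, threadDelay)
import Control.Concurrent.MVar
  (MVar, modifyMVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Exception (mask_)
import qualified Data.Map as M
import System.Exit (ExitCode(..))
import System.Process
  (ProcessHandle, createProcess, proc, terminateProcess, waitForProcess)

-- | Hypothetical table of the processes each thread has started.
type ProcTable = MVar (M.Map ThreadId [ProcessHandle])

newProcTable :: IO ProcTable
newProcTable = newMVar M.empty

-- | What a utility function would call instead of createProcess: start
-- the command and file its handle under the calling thread. Masked so
-- the thread is not normally killed between starting and recording.
startRecorded :: ProcTable -> FilePath -> [String] -> IO ProcessHandle
startRecorded table cmd args = mask_ $ do
  tid <- myThreadId
  (_, _, _, ph) <- createProcess (proc cmd args)
  modifyMVar_ table (return . M.insertWith (++) tid [ph])
  return ph

-- | Cancel a transfer thread: terminate the processes recorded for it,
-- kill the thread, then reap the processes.
cancelThread :: ProcTable -> ThreadId -> IO [ExitCode]
cancelThread table tid = do
  phs <- modifyMVar table $ \m ->
    return (M.delete tid m, M.findWithDefault [] tid m)
  mapM_ terminateProcess phs
  killThread tid
  mapM waitForProcess phs

-- | Example: a "transfer" thread starts sleep and then idles; cancelling
-- it returns the exit code of the terminated process.
demo :: IO [ExitCode]
demo = do
  table <- newProcTable
  started <- newEmptyMVar
  tid <- forkIO $ do
    _ <- startRecorded table "sleep" ["30"]
    putMVar started ()
    threadDelay 30000000 -- stand-in for the thread's own work
  takeMVar started
  cancelThread table tid
```

Every utility function that forks a process would have to go through something like `startRecorded`, which is the part that invites mistakes.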

Oh well, at least git-annex is already crash-safe, so once I figure out how to kill a transfer process, I can kill it safely. :)

What if storeKey, retrieveKeyFile, etc. return an IO () which cancels the operation, if possible? The implementation can be canceled regardless of whether it uses separate processes or Haskell threads.
Comment by http://claimid.com/strager Sat Aug 11 04:50:52 2012

In fact, making a dedicated data type or some typeclasses may be more appropriate:

class Cancelable a where cancel :: a -> IO ()
class Pauseable a where pause :: a -> IO ()

-- Alternatively:

data Transfer = Transfer { cancel :: IO (), pause :: IO () }

-- Or both!
Comment by http://claimid.com/strager Sat Aug 11 04:55:13 2012

That's the lines I was thinking along, and I even made a throwaway branch with some types. But the problem is reworking all the code to do that. Particularly since lots of the code uses generic utility functions that are reused in other, unrelated places and would have to be modified to pass back cancel actions.

The first case the type checker landed me on when I changed the types was code that downloads a URL from the web. Naturally that uses a Utility.Url.download. How to cancel download? Depends on its implementation -- it happens to currently shell out to curl, so you have to kill curl, but it could just as easily have used libcurl (other parts of my Utility.Url library do), and then it would need to fork its own thread. So it's an abstraction layer violation problem.

If I had a month to devote to this one problem, I might manage to come up with some clean solution involving monads, or maybe convert all my code to use conduit or something that might allow managing these effects better. Just a guess..

Comment by http://joeyh.name/ Sat Aug 11 14:41:51 2012

"How to cancel download? Depends on its implementation .... So it's an abstraction layer violation problem."

Precisely why I suggested returning something as generic as IO ():

-- Current
download :: URLString -> Headers -> [CommandParam] -> FilePath -> IO Bool

-- Suggestion
data Transfer a = Transfer { run :: IO a, cancel :: IO () }
download :: URLString -> Headers -> [CommandParam] -> FilePath -> IO (Transfer Bool)

transfer <- download ...
-- You can pass `cancel transfer` to another thread
-- which you want to be able to cancel the transfer.
run transfer  -- blocking

I realized while writing this that you may not get any result from e.g. a download while it is occurring (because the function is blocking). Maybe that's where a misunderstanding occurred. I separated the concepts of creating a transfer and starting/canceling it.
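A rough sketch of how one `Transfer` type could cover both kinds of implementation, assuming `run` is called at most once per transfer. `processTransfer`, `threadTransfer`, `Phase` and `stopWith` are hypothetical names. Polling the exit code and turning every exception into `Nothing` are simplifications, not a proposal for the real code.

```haskell
import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar
  (modifyMVar, modifyMVar_, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Exception (SomeException, mask, try)
import System.Exit (ExitCode(..))
import System.Process
  (createProcess, getProcessExitCode, proc, terminateProcess)

-- | Same shape as the suggested type: build now, run or cancel later.
data Transfer a = Transfer { run :: IO a, cancel :: IO () }

-- | Bookkeeping shared by run and cancel; h is whatever gets killed.
data Phase h = Idle | Running h | Cancelled

-- | Cancelling marks the transfer and kills whatever is running.
stopWith :: (h -> IO ()) -> Phase h -> IO (Phase h)
stopWith kill (Running h) = kill h >> return Cancelled
stopWith _ _ = return Cancelled

-- | Transfer backed by an external command; the result is its success.
processTransfer :: FilePath -> [String] -> IO (Transfer Bool)
processTransfer cmd args = do
  phase <- newMVar Idle
  let start = modifyMVar phase $ \p -> case p of
        Idle -> do
          (_, _, _, ph) <- createProcess (proc cmd args)
          return (Running ph, Just ph)
        _ -> return (p, Nothing) -- cancelled or already started
      poll ph = do
        status <- getProcessExitCode ph
        case status of
          Just code -> return (code == ExitSuccess)
          Nothing -> threadDelay 100000 >> poll ph
  return Transfer
    { run = start >>= maybe (return False) poll
    , cancel = modifyMVar_ phase (stopWith terminateProcess)
    }

-- | Transfer backed by a Haskell thread; Nothing means it was cancelled
-- or threw an exception.
threadTransfer :: IO a -> IO (Transfer (Maybe a))
threadTransfer action = do
  phase <- newMVar Idle
  result <- newEmptyMVar
  let toMaybe = either (\e -> const Nothing (e :: SomeException)) Just
      start = modifyMVar phase $ \p -> case p of
        Idle -> do
          -- Only the action itself can be interrupted, so the worker
          -- always gets to report a result.
          tid <- mask $ \restore ->
            forkIO (try (restore action) >>= putMVar result . toMaybe)
          return (Running tid, True)
        _ -> return (p, False)
  return Transfer
    { run = do
        started <- start
        if started then takeMVar result else return Nothing
    , cancel = modifyMVar_ phase (stopWith killThread)
    }
```

The caller only ever sees `run` and `cancel`, so whether curl or a libcurl thread does the work stays hidden behind the same two actions.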

(My idea is starting to feel a bit object-oriented... ;P)

Comment by http://claimid.com/strager Sat Aug 11 16:08:47 2012