Friday, February 10, 2017

#haskell channel featuring lambdabot, biglama, ondraa, lyxia, alexbiehl, insitu,

tsahyt 2017-02-10 00:45:15
for libraries I mean
tsahyt 2017-02-10 00:46:03
biglama: have you had a look at RAM usage?
biglama 2017-02-10 00:47:01
tsahyt: is hp2ps the tool for that ?
tsahyt 2017-02-10 00:47:15
for a start, I usually just run it with +RTS -s
alexbiehl 2017-02-10 00:47:19
Grisha, people argue that having ExceptT Err IO a is an anti pattern because
alexbiehl 2017-02-10 00:47:25
check out: https://www.fpcomplete.com/blog/2016/11/exceptions-best-practices-haskell
tsahyt 2017-02-10 00:47:40
biglama: big red flags are when it uses an unreasonably large amount of RAM or the productivity value is very low
Grisha 2017-02-10 00:47:55
alexbiehl: that exactly answers my question
Grisha 2017-02-10 00:48:37
alexbiehl: basically I want it just to be able to play around in repl, having a monad around my Statements and poking them by <$>
tsahyt 2017-02-10 00:49:56
biglama: do you have some test data somewhere that you can share? I'd like to try rewriting some parts of it in an applicative style and see whether that helps
biglama 2017-02-10 00:51:05
tsahyt: I watched htop during the execution but the ram usage was very low
tsahyt 2017-02-10 00:51:11
hmm okay
alexbiehl 2017-02-10 00:51:53
You could hide IO in your monad and make sure to convert any exception occurring to your explicit error type
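
A minimal sketch of what alexbiehl suggests, assuming a hypothetical error type `Err` and a hypothetical wrapper `App` (neither name comes from the discussion). IO is hidden behind the newtype, and the only way in is `safeIO`, which turns any thrown exception into an explicit `Err`:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
import Control.Exception (SomeException, try)
import Control.Monad.Trans.Except (ExceptT (..), runExceptT)

-- Hypothetical explicit error type.
newtype Err = Err String
  deriving (Show, Eq)

-- IO is hidden: no MonadIO instance is derived, so callers cannot
-- lift arbitrary IO actions that might throw.
newtype App a = App (ExceptT Err IO a)
  deriving (Functor, Applicative, Monad)

runApp :: App a -> IO (Either Err a)
runApp (App m) = runExceptT m

-- Run an IO action and convert any exception into Err.
-- Catching SomeException also catches asynchronous exceptions,
-- which a real program may prefer to let through.
safeIO :: IO a -> App a
safeIO action = App . ExceptT $ do
  result <- try action
  pure $ case result of
    Left e  -> Left (Err (show (e :: SomeException)))
    Right a -> Right a
```

With this, `runApp (safeIO (readFile path))` returns a `Left` for a missing file instead of throwing.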
biglama 2017-02-10 00:51:54
tsahyt: sure. I have a small test case (10 000, 60s on my computer with profiling)
biglama 2017-02-10 00:52:03
10 000 lines*
biglama 2017-02-10 00:52:34
thanks for your time btw. I'm a little bit at a loss here
tsahyt 2017-02-10 00:52:46
yeah, so am I with my project, so I can use a little distraction
biglama 2017-02-10 00:52:49
I also posted on the haskell-beginners mailing list out of desperation
biglama 2017-02-10 00:52:55
despair*
biglama 2017-02-10 00:55:17
tsahyt: https://drive.google.com/open?id=0B6BoMOZHCeZESC1Pb05KNXQ0V28
tsahyt 2017-02-10 00:55:23
biglama: I see that you often parse something, give it a name and then don't use it
biglama 2017-02-10 00:55:24
if google drive is okay for you
tsahyt 2017-02-10 00:55:38
google drive is ok
tsahyt 2017-02-10 00:55:57
e.g. in the time parser, you parse t' but never use it afterwards. is that intentional?
biglama 2017-02-10 00:55:57
tsahyt: can you give me an example ?
tsahyt 2017-02-10 00:56:23
same in iter, you parse t and then don't use it later on
biglama 2017-02-10 00:56:24
t' should be used, but I didn't add it at the moment
biglama 2017-02-10 00:57:10
tsahyt: yes, should be used later, if I got the code working
tsahyt 2017-02-10 00:57:26
biglama: I suppose instead of the 0?
merijn 2017-02-10 00:58:27
biglama: tbh htop is pretty useless for determining RAM usage
_sras_ 2017-02-10 00:58:30
When I make lenses for a record, how can I make certain fields read only?
biglama 2017-02-10 00:59:37
tsahyt: oh right, I was debugging earlier on. sorry
tsahyt 2017-02-10 00:59:57
I was just asking because it's actually simpler to use it in applicative style than leave it out
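
To illustrate tsahyt's point, here is a small sketch using attoparsec. The `Sample` record and its comma-separated format are hypothetical and are not taken from biglama's code. In applicative style each parsed value goes straight into the constructor, so there is no name to bind and then forget:

```haskell
import Data.Attoparsec.Text (Parser, char, double, parseOnly)
import qualified Data.Text as T

-- Hypothetical record: a time stamp followed by a value.
data Sample = Sample { sampleTime :: Double, sampleValue :: Double }
  deriving (Show, Eq)

-- Monadic style: every intermediate result needs a name.
sampleM :: Parser Sample
sampleM = do
  t <- double
  _ <- char ','
  v <- double
  pure (Sample t v)

-- Applicative style: results feed the constructor in order,
-- and `<*` runs the separator but drops its result.
sampleA :: Parser Sample
sampleA = Sample <$> double <* char ',' <*> double

parseSample :: String -> Either String Sample
parseSample = parseOnly sampleA . T.pack
```

Both parsers accept the same input; the applicative one simply cannot leave a parsed value unused by accident.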
biglama 2017-02-10 01:00:22
merijn: what do you recommend instead ?
biglama 2017-02-10 01:00:40
tsahyt: I'm not that good with applicative either
tsahyt 2017-02-10 01:00:49
biglama: how long does it take for you without profiling?
tsahyt 2017-02-10 01:00:58
I forgot to benchmark the before
merijn 2017-02-10 01:01:07
biglama: The profiling output of GHC that tsahyt mentioned
merijn 2017-02-10 01:02:14
biglama: The problem is that htop reports a lot of different memory statistics, and you need to know exactly what they mean to divine anything useful from them. For example, if you use GHC 8 on x64, htop will basically report every program, including hello world, as using about 1TB of RAM
tsahyt 2017-02-10 01:02:23
biglama: in the time parser, what's supposed to happen to t'?
biglama 2017-02-10 01:03:19
tsahyt: 67s (with 58s in cpu time) for the 10 000-line file
tsahyt 2017-02-10 01:03:22
biglama: is 0.0,1,7.0,25.0,25.0,0.0,0.0,0.0,0,0,0,2,0 the correct output for this test case?
biglama 2017-02-10 01:03:33
merijn: is it the .prof file ? I posted it earlier : http://lpaste.net/352306
tsahyt 2017-02-10 01:03:36
what I did shouldn't have broken anything I think, just making sure
biglama 2017-02-10 01:04:51
tsahyt: you should have around 7000 lines like these
tsahyt 2017-02-10 01:04:56
that is interesting
biglama 2017-02-10 01:05:00
did you replace mySep by skipSpace ?
tsahyt 2017-02-10 01:05:03
yes
biglama 2017-02-10 01:05:32
tsahyt: the parser is a bit clunky for newlines and spaces
biglama 2017-02-10 01:05:56
tsahyt: which is why I used mySep, to avoid parsing newlines (it serves as a delimiter)
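
The difference matters because attoparsec's `skipSpace` also consumes newlines. The definition of `mySep` is not shown in the log, so the following is only a sketch of one way to get a separator that stops at the end of a line; `hspace`, `Point`, and the three-column format are assumptions:

```haskell
import Data.Attoparsec.Text
  (Parser, double, endOfLine, isHorizontalSpace, parseOnly, sepBy, skipWhile)
import qualified Data.Text as T

-- Hypothetical data type: three coordinates per line.
data Point = Point Double Double Double
  deriving (Show, Eq)

-- Skip spaces and tabs only, so a newline still ends the record.
hspace :: Parser ()
hspace = skipWhile isHorizontalSpace

point :: Parser Point
point = Point <$> (double <* hspace) <*> (double <* hspace) <*> double

-- Newlines act as the delimiter between records.
points :: Parser [Point]
points = point `sepBy` endOfLine
```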
tsahyt 2017-02-10 01:06:34
alright that makes a difference
biglama 2017-02-10 01:06:55
tsahyt: the idea of the parser is basically to just reformat all lines from the input file
tsahyt 2017-02-10 01:07:37
takes 30 seconds now on this file
tsahyt 2017-02-10 01:07:43
I'll test the original
tsahyt 2017-02-10 01:08:33
but 30s still seems way too slow imo
biglama 2017-02-10 01:08:43
as a comparison, I have some C++ code which just reads the file line by line and prints them in the new format
biglama 2017-02-10 01:08:49
takes 1s for 200 000 lines
biglama 2017-02-10 01:08:51
so yes
tsahyt 2017-02-10 01:09:21
is it really a line by line transformation?
merijn 2017-02-10 01:09:21
Well, one problem is that you're reading the entire file into memory, parsing, then writing out, whereas the C++ one is probably streaming?
merijn 2017-02-10 01:09:34
In which case something like conduit/pipes would be more appropriate
merijn 2017-02-10 01:09:43
But also less beginner friendly :)
tsahyt 2017-02-10 01:10:29
agreed, a streaming abstraction would make a lot of sense in that case
biglama 2017-02-10 01:10:50
good point
biglama 2017-02-10 01:11:26
the format is a list of objects, where each object contains a set of point coordinates
biglama 2017-02-10 01:11:45
mega/attoparsec gives me a lot of flexibility in the output format as I deal with data types
_sras_ 2017-02-10 01:12:16
When I make lenses for a record, how can I make certain fields read only?
tsahyt 2017-02-10 01:12:42
biglama: the encoding for your input is always ASCII?
biglama 2017-02-10 01:12:43
the c++ version just calls readline() till the end of file
biglama 2017-02-10 01:12:53
tsahyt: should be, yes
tsahyt 2017-02-10 01:13:34
so every one of those blocks ends up as one output line then I suppose?
biglama 2017-02-10 01:14:21
tsahyt: each point (the Particle data type in the code) is written as one line, yes
biglama 2017-02-10 01:14:27
(at the moment)
biglama 2017-02-10 01:14:49
merijn: if I have to use streaming, I'll keep the c++ version
biglama 2017-02-10 01:14:57
merijn: I wanted some abstraction for once
merijn 2017-02-10 01:15:14
biglama: To be fair, there's some pretty nice streaming abstractions in Haskell :)
tsahyt 2017-02-10 01:15:19
biglama: have you seen the pipes library? I would call that a nice abstraction
tsahyt 2017-02-10 01:16:29
what you want (when opting for pipes) is pipes, pipes-{bytestring,text}, and pipes-attoparsec. the parsing code can look mostly the same as before, and you just have to plumb it together using pipes.
tsahyt 2017-02-10 01:16:40
which comes down to not much more code than the IO code you already have, but you get streaming for it
biglama 2017-02-10 01:17:39
tsahyt: so I get to keep my datatype ?
tsahyt 2017-02-10 01:17:50
Particle and Iteration? sure
insitu 2017-02-10 01:18:25
hello, I am trying to write some code for (de)serializing data type with the concept of versions
biglama 2017-02-10 01:18:27
tsahyt: that would be nice
tsahyt 2017-02-10 01:18:53
you have a Producer which reads from the file and shovels lines downstream, and a Consumer which eats the output from upstream and writes it to the file
insitu 2017-02-10 01:19:15
the idea is that you can have an older (byte-level) representation of some data type that you want to deserialise, knowing it has some version X
biglama 2017-02-10 01:19:25
tsahyt: okay, I'll look into it when I have some free time
tsahyt 2017-02-10 01:19:31
inbetween you have a parser, pulling lines as needed from upstream, pushing out Particles downstream, and a serializer which takes the parser output and converts it into text
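
The shape tsahyt describes can be sketched with the `pipes` package alone. This simplified version works line by line on `String` and replaces the attoparsec parser and the serializer with a single hypothetical `reformatLine` step, so it shows the plumbing rather than biglama's actual format:

```haskell
import Data.List (intercalate)
import Pipes (Pipe, runEffect, (>->))
import qualified Pipes.Prelude as P
import System.IO (IOMode (ReadMode, WriteMode), withFile)

-- Hypothetical transformation: whitespace-separated fields
-- become comma-separated fields.
reformatLine :: String -> String
reformatLine = intercalate "," . words

-- The middle stage: pulls lines from upstream, pushes
-- reformatted lines downstream.
reformatPipe :: Monad m => Pipe String String m r
reformatPipe = P.map reformatLine

-- Producer (file reader) >-> Pipe (transform) >-> Consumer (file writer).
-- Only one line needs to be in memory at a time.
convertFile :: FilePath -> FilePath -> IO ()
convertFile src dst =
  withFile src ReadMode $ \hIn ->
    withFile dst WriteMode $ \hOut ->
      runEffect (P.fromHandle hIn >-> reformatPipe >-> P.toHandle hOut)
```

For records that span several lines, pipes-attoparsec (as mentioned above) would take the place of the per-line `P.map` stage.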
biglama 2017-02-10 01:19:40
just wanted to introduce Haskell at work by showing how easy it was to use
biglama 2017-02-10 01:19:59
for that, it's a bit of a failure :)
tsahyt 2017-02-10 01:20:00
biglama: the pipes version would be a lot more elegant imo, and maybe even a better way to show off Haskell
biglama 2017-02-10 01:20:18
tsahyt: okay, thanks for the explanation
biglama 2017-02-10 01:20:31
but! it does not explain why attoparsec is so slow
insitu 2017-02-10 01:20:32
I made it work by going down into the plumbing of the deserialisation code
biglama 2017-02-10 01:20:32
:)
tsahyt 2017-02-10 01:20:34
of course you can also use conduit to do it or any other streaming IO library, but pipes is the only one I have worked with so far
merijn 2017-02-10 01:20:59
insitu: There's a library for that, I think
insitu 2017-02-10 01:21:03
great
merijn 2017-02-10 01:21:28
@hackage safecopy
lambdabot 2017-02-10 01:21:28
http://hackage.haskell.org/package/safecopy
_sras_ 2017-02-10 01:22:04
is it possible to generate only getters using Lens package?
insitu 2017-02-10 01:22:12
thanks a lot, will look at it
insitu 2017-02-10 01:23:00
oh, I don't want to keep the old type around
biglama 2017-02-10 01:23:23
tsahyt, merijn : thanks for the help anyway
insitu 2017-02-10 01:24:11
and I would like things to be composable, so if I have a getter for version X -> Y and a getter for version Y -> Z, I can have a getter for X -> Z
insitu 2017-02-10 01:25:28
the version is a global property (within the limits of some domain) so I don't need to have it around for each element
tsahyt 2017-02-10 01:25:37
biglama: the most likely culprit that I can make out is the way you treat spaces, but I'm not sure
insitu 2017-02-10 01:25:58
but safecopy is definitely interesting to explore
biglama 2017-02-10 01:28:14
tsahyt: you think it could make the code run that slow ?
_sras_ 2017-02-10 01:29:01
When I make lenses for a record, how can I make certain fields read only?
lyxia 2017-02-10 01:29:51
make getters instead
_sras_ 2017-02-10 01:37:30
lyxia: how do I do that?
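
One way to follow lyxia's suggestion with the lens package is to write the read-only optic by hand with `to`, which builds a `Getter` from a plain function. The `Account` record below is hypothetical:

```haskell
import Control.Lens (Getter, Lens', lens, to, (%~), (^.))

-- Hypothetical record with one read-only and one read-write field.
data Account = Account
  { _accountId :: Int
  , _balance   :: Double
  }

-- Read-only: a Getter can be used with (^.) or view, but not with setters.
accountId :: Getter Account Int
accountId = to _accountId

-- Read-write: an ordinary lens.
balance :: Lens' Account Double
balance = lens _balance (\acct b -> acct { _balance = b })

deposit :: Double -> Account -> Account
deposit amount = balance %~ (+ amount)

describe :: Account -> String
describe acct = show (acct ^. accountId) ++ ": " ++ show (acct ^. balance)
```

An expression such as `acct & accountId .~ 1` is rejected by the type checker, because a `Getter` cannot be used as a setter. For this to protect anything, the module must also avoid exporting the `_accountId` field selector.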
tsahyt 2017-02-10 01:39:32
biglama: dunno, the one big cost center is somewhere buried in attoparsec code, but it gets called by mySep at some point, unless I've counted the indents wrong
tsahyt 2017-02-10 01:43:04
biglama: good news
tsahyt 2017-02-10 01:43:19
I've converted it to use lazy text and now I get 7000 lines of output in about 0.2 seconds
tsahyt 2017-02-10 01:43:29
the streaming really does make the difference apparently
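
tsahyt's converted code is not shown in the log. As a rough sketch of the general idea only, attoparsec's lazy interface can be applied repeatedly to a lazily read file, so records are parsed as chunks arrive rather than after the whole file is loaded. The `record` parser here (one integer per line) is a stand-in, not biglama's format:

```haskell
import Data.Attoparsec.Text (Parser, decimal, endOfLine)
import qualified Data.Attoparsec.Text.Lazy as L
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TLIO

-- Stand-in record parser: one integer per line.
record :: Parser Int
record = decimal <* endOfLine

-- Parse records one after another from lazy input.
-- Stops silently at the first failure; the parser must consume
-- input on success, otherwise this would loop forever.
parseAll :: Parser a -> TL.Text -> [a]
parseAll p input
  | TL.null input = []
  | otherwise =
      case L.parse p input of
        L.Done rest x -> x : parseAll p rest
        L.Fail {}     -> []

-- The file is read lazily, chunk by chunk, as the sum demands it.
sumFile :: FilePath -> IO Int
sumFile path = sum . parseAll record <$> TLIO.readFile path
```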
ondraa 2017-02-10 01:43:32
hello, what is the standard way to convert Int -> Text?
ondraa 2017-02-10 01:43:42
I can't find anything on hoogle
ondraa 2017-02-10 01:44:10
by convert I mean standard 1234 -> "1234"
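
The log does not include an answer to ondraa's question. Two common options, sketched here with the `text` package (the function names `intToText` and `intToTextB` are made up for illustration), are going through `show` or using the decimal builder:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Builder (toLazyText)
import Data.Text.Lazy.Builder.Int (decimal)

-- Simple: render with show, then pack the String.
intToText :: Int -> T.Text
intToText = T.pack . show

-- Via the Builder API, which avoids the intermediate String.
intToTextB :: Int -> T.Text
intToTextB = TL.toStrict . toLazyText . decimal
```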
tsahyt 2017-02-10 01:44:43
biglama: unfortunately I'm not entirely sure why the strict version is that slow in comparison, but at least you have a working solution now