Search Haskell Channel Logs

Monday, January 30, 2017

#haskell channel featuring opqdonut, nickager, tarragon, lpaste, zipper, Insanity_, and 8 others.

cocreature 2017-01-30 00:46:36
tarragon: I think "because the author likes it" is the best answer you are going to get for that question. he has written a bit about why he does so https://groups.google.com/forum/#!topic/pandoc-discuss/0rutNJAVKoc
nickager 2017-01-30 00:46:46
lyxia: so `decode` produces a non-lazy result, but does so efficiently using a lazy bytestring?
tarragon 2017-01-30 00:46:58
cocreature: interesting i'll read
tarragon 2017-01-30 00:48:05
cocreature: I find the choice of this language for such a big and popular project interesting. Haskell must restrict contributions to it, since so few people know it.
cocreature 2017-01-30 00:49:05
tarragon: for most open-source projects the author chooses the language he likes, not the one he expects most other people to know
lyxia 2017-01-30 00:49:54
nickager: efficiently meaning it doesn't hold the whole bytestring in memory at once
tarragon 2017-01-30 00:50:29
wow, can anybody explain the difference of this statement? " If I were starting over, I'd use Text
tarragon 2017-01-30 00:50:32
everywhere instead of String."
nickager 2017-01-30 00:50:36
lyxia: thanks , that's helped
Insanity_ 2017-01-30 00:50:45
I have a problem that I _think_ I understand, but I would like to get some clarification:
Insanity_ 2017-01-30 00:51:01
I have made a small Haskell program that compares two sets, the data is in two separate CSV files
tarragon 2017-01-30 00:51:04
how does 'text' look different from 'string' in haskell, or whatever he's talking about?
Insanity_ 2017-01-30 00:51:25
So I want to find the set-theoretic difference. I can read the files and then use the Data.List.\\ operator to compare the two sets (A \\ B)
zipper 2017-01-30 00:51:33
tarragon: It's not about the number of contributors but the quality of them
Insanity_ 2017-01-30 00:51:43
This seems to work well for files with just a few entries, but it does not work for quite large files (1000+ entries)
Insanity_ 2017-01-30 00:51:55
Could Haskell's laziness have anything to do with this, or should I be hunting for a bug?
opqdonut 2017-01-30 00:51:56
Insanity_: use Data.Set?
roxxik 2017-01-30 00:51:59
tarragon: String is a List of Char ([Char]) while Text is an opaque type backed by an array (i think)
opqdonut 2017-01-30 00:52:17
Insanity_: Data.List.\\ is quadratic (O(n^2))
lpaste 2017-01-30 00:52:44
Dylan pasted "No title" at http://lpaste.net/351791
Insanity_ 2017-01-30 00:52:48
http://lpaste.net/351791
Insanity_ 2017-01-30 00:52:58
Well, I should have given it a title, but that is the code
Insanity_ 2017-01-30 00:53:03
opqdonut: thanks for that information
Insanity_ 2017-01-30 00:53:15
I am not sure if that will solve the problem but I will try that now
Insanity_ 2017-01-30 00:53:26
Though then I would have surely not understood the problem at hand :D
roxxik 2017-01-30 00:53:31
Insanity_: try with some hundred entries and time it
Insanity_ 2017-01-30 00:54:01
oh wait I realise I might not have explained what the actual error is..
Insanity_ 2017-01-30 00:54:09
The problem is that for large files, the output is incorrect
tarragon 2017-01-30 00:54:15
roxxik: so why is the pandoc author saying that 'text' is better than 'string'?
Insanity_ 2017-01-30 00:54:24
if I have a set of 1000+ lines, the result of A
Insanity_ 2017-01-30 00:54:46
A \\ B still contains the lines from B that it should not
opqdonut 2017-01-30 00:55:04
> [1,1,1] \\ [1]
lambdabot 2017-01-30 00:55:09
[1,1]
roxxik 2017-01-30 00:55:11
tarragon: the whole discussion is all about performance. text won't do anything for you if you don't care about performance. but if you do, you have to either pack and unpack a lot, or rewrite to only use text
opqdonut 2017-01-30 00:55:13
Insanity_: see, \\ only removes one entry if there are multiple
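The behaviour opqdonut describes can be reproduced with a small sketch. `firstOnly` is the expression from the lambdabot exchange; `removeAll` is a hypothetical helper (not from the pasted code) showing one way to drop every matching entry, assuming the containers package is available:

```haskell
import Data.List ((\\))
import qualified Data.Set as Set

-- (\\) removes one occurrence from the left list per element of the right list.
firstOnly :: [Int]
firstOnly = [1, 1, 1] \\ [1]

-- Hypothetical helper: drop every element of xs that occurs anywhere in ys,
-- keeping the order (and any duplicates) of what remains.
removeAll :: Ord a => [a] -> [a] -> [a]
removeAll xs ys = filter (`Set.notMember` banned) xs
  where
    banned = Set.fromList ys
```

Building the Set once also avoids the quadratic cost of scanning the second list for every element of the first.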
Insanity_ 2017-01-30 00:56:07
file A and B should be unique (I am getting data from Splunk, on which I am running 'distinct'), let me just verify that the data got through correctly
roxxik 2017-01-30 00:56:27
tarragon: first sentence from Data.Text doc: "A time and space-efficient implementation of Unicode text. Suitable for performance critical use, both in terms of large data quantities and high speed."
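As a rough illustration of the "pack and unpack" cost roxxik mentions: code that keeps String at its boundaries but wants Text operations in the middle has to convert in both directions. `shout` is a made-up example name, and the text package is assumed to be installed:

```haskell
import qualified Data.Text as T

-- A String is a linked list of Char; T.pack copies it into Text's compact
-- representation and T.unpack copies it back out again.
shout :: String -> String
shout = T.unpack . T.toUpper . T.pack
```

Using Text end to end would avoid both conversions, which seems to be the point of the quoted "use Text everywhere" remark.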
Insanity_ 2017-01-30 00:56:49
opqdonut: yeah, the data is unique
zipper 2017-01-30 00:57:04
Insanity_: You're using readFile from Prelude, right?
Insanity_ 2017-01-30 00:57:12
Yeah I am
zipper 2017-01-30 00:59:31
Insanity_: If you think it is laziness could you enable BangPatterns language extension and call it as so: `let difference = findSetTheoreticDifference (lineToList !aContent commaRegex) (lineToList !bContent commaRegex)`
zipper 2017-01-30 00:59:40
Insanity_: https://downloads.haskell.org/~ghc/7.8.2/docs/html/users_guide/bang-patterns.html
zipper 2017-01-30 00:59:57
That will force strictness and tell you whether laziness is the issue.
merijn 2017-01-30 01:00:02
Laziness won't result in wrong results
opqdonut 2017-01-30 01:00:03
zipper: err bang patterns don't work like that
merijn 2017-01-30 01:00:16
zipper: That's not a valid use of bang-patterns either
opqdonut 2017-01-30 01:00:19
zipper: and also forcing a list will only force the first cons-cell
Insanity_ 2017-01-30 01:00:21
I thought it would maybe compare part of incomplete files merijn
merijn 2017-01-30 01:00:34
Insanity_: How big are your files?
zipper 2017-01-30 01:01:02
Oh well lucky I tried to help and got helped
zipper 2017-01-30 01:01:08
opqdonut: How do they work?
zipper 2017-01-30 01:01:10
hmmmm
Insanity_ 2017-01-30 01:01:18
one is about 1.5k lines, the other is 709 entries (csv)
zipper 2017-01-30 01:01:21
Seems that's what I can find
zipper 2017-01-30 01:01:32
That's not too much for a modern computer
zipper 2017-01-30 01:01:41
1.5 k lines aint much
zipper 2017-01-30 01:02:00
That's like 300KB file?
Insanity_ 2017-01-30 01:02:02
No, I thought so as well. But I am not sure how the whole lazy deal works in Haskell :-)
zipper 2017-01-30 01:02:02
at most?
opqdonut 2017-01-30 01:02:07
zipper: you use them in a pattern, e.g. a function definition "f :: Int -> Int -> String; f !x !y = show (x+y)"
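opqdonut's one-liner, written out as a compilable sketch. `f` is taken from the message above; `headOnly` and `spineLength` are hypothetical names added to illustrate the earlier remark that forcing a list only forces its first cons cell:

```haskell
{-# LANGUAGE BangPatterns #-}

-- Bangs belong on patterns (here, function arguments),
-- not on expressions at a call site.
f :: Int -> Int -> String
f !x !y = show (x + y)

-- A bang on a list argument only evaluates it to its outermost constructor;
-- the tail and the elements stay unevaluated.
headOnly :: [Int] -> String
headOnly !xs = "forced to WHNF"

-- To walk the whole spine, something has to traverse the list, e.g. length
-- (which still leaves the elements themselves unevaluated).
spineLength :: [Int] -> Int
spineLength = length
```

Under these definitions `headOnly (1 : undefined)` returns normally, because the undefined tail is never inspected.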
merijn 2017-01-30 01:02:13
Insanity_: Honestly, you should probably not be using readFile from Prelude for this to begin with
zipper 2017-01-30 01:02:27
merijn: Yeah lazy bytestring?
merijn 2017-01-30 01:02:32
No, Text
opqdonut 2017-01-30 01:02:36
Insanity_: merijn: but it should work using readFile
merijn 2017-01-30 01:02:42
opqdonut: Agreed
Insanity_ 2017-01-30 01:02:54
opqdonut/merijn: Well, that puts me back in square one :-)
zipper 2017-01-30 01:02:57
merijn: When should I use bytestring over text and vice versa?
zipper 2017-01-30 01:03:11
I see stuff that reads over the net uses bytestring
opqdonut 2017-01-30 01:03:13
zipper: bytestring for binary, text for text
merijn 2017-01-30 01:03:14
zipper: Simple answer: pretend 'ByteString' is called 'Bytes'
zipper 2017-01-30 01:03:27
I assume because you can have yeah binary data like pics over the wire
merijn 2017-01-30 01:03:32
zipper: The name is an unfortunate historical accident and it has nothing to do with strings
zipper 2017-01-30 01:03:33
I see
zipper 2017-01-30 01:03:49
For example:
zipper 2017-01-30 01:04:02
:t Network.Wreq.get
lambdabot 2017-01-30 01:04:04
error:
lambdabot 2017-01-30 01:04:05
Not in scope: 'Network.Wreq.get'
lambdabot 2017-01-30 01:04:05
No module named 'Network.Wreq' is imported.
merijn 2017-01-30 01:04:08
Other than "unicode strings can be encoded as binary data"
merijn 2017-01-30 01:04:55
zipper: So normally when you talk on the network you get binary data and if you want to treat it as Text you need to know how it's encoded and then decode from binary to Text
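A minimal sketch of the step merijn describes, assuming the bytes are UTF-8 (the encoding is an assumption here; a real protocol has to tell you which one applies). `decodeBody` is a hypothetical name; `decodeUtf8'` comes from the text package's Data.Text.Encoding:

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Turn raw bytes into Text, reporting invalid UTF-8 instead of throwing.
decodeBody :: BS.ByteString -> Either String T.Text
decodeBody bytes =
  case TE.decodeUtf8' bytes of
    Left err -> Left (show err)
    Right txt -> Right txt
```

Going the other way, `TE.encodeUtf8` turns Text back into bytes for sending.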
Insanity_ 2017-01-30 01:04:58
merijn: If laziness is not the issue, do you have any idea where the issue might be regardless of that?
merijn 2017-01-30 01:05:24
Insanity_: tbh, I just saw 4 lines of code out of context and I don't know what the types/implementations of anything are
zipper 2017-01-30 01:05:29
Insanity_: Do you have any idea where it breaks?
zipper 2017-01-30 01:05:39
Insanity_: print debugging?
zipper 2017-01-30 01:05:49
Do you actually read both files?
Insanity_ 2017-01-30 01:05:59
Well there really is not much else to it apart from those lines. I can show you the missing lines (which is two of them)
Insanity_ 2017-01-30 01:06:11
And the odd thing is that it does work for all other files I tested that are smaller
lpaste 2017-01-30 01:07:16
Dylan revised "No title": "compare-csv-files" at http://lpaste.net/351791
zipper 2017-01-30 01:07:17
Insanity_: You should find out where line x finishes execution but line x+1 doesn't. Then try to figure out what happens at the point of breakage.
Insanity_ 2017-01-30 01:07:54
I'll see if I can find some more information about the point where it breaks exactly :-)
merijn 2017-01-30 01:08:26
Insanity_: So, why are you using lists and \\ rather than Set for set difference?
Insanity_ 2017-01-30 01:08:44
Because I only found out about that about 5 minutes ago ^^
merijn 2017-01-30 01:09:00
Insanity_: I have no clue what lineToList does, btw
Insanity_ 2017-01-30 01:09:00
I am still very much learning Haskell on the side :p
Insanity_ 2017-01-30 01:09:03
ah
merijn 2017-01-30 01:09:05
Set will be faster and more efficient
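A sketch of what the Set-based version might look like. It is not the pasted program: `missingFrom` is a hypothetical helper, and it assumes one entry per line rather than the regex split used in the paste:

```haskell
import qualified Data.Set as Set
import qualified Data.Text as T

-- Entries of the first input that do not appear in the second, one entry
-- per line. T.strip drops stray whitespace such as a trailing '\r'.
missingFrom :: T.Text -> T.Text -> [T.Text]
missingFrom a b = Set.toAscList (toSet a `Set.difference` toSet b)
  where
    toSet = Set.fromList . filter (not . T.null) . map T.strip . T.lines
```

The two inputs could be read with `Data.Text.IO.readFile`; the result comes back sorted and without duplicates, since it passes through a Set.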
Insanity_ 2017-01-30 01:09:06
splitRegex regex input
Insanity_ 2017-01-30 01:09:25
just splits the line of input with the regex mkRegex ","
Insanity_ 2017-01-30 01:09:44
Yeah someone explained that to me a bit earlier, I will revise that
zipper 2017-01-30 01:10:47
Maybe your "parsing" is failing.
Insanity_ 2017-01-30 01:13:36
zipper: Good point, perhaps the output of some of the entries has different line endings or something subtle. I'll check with smaller files with just the failing entries
merijn 2017-01-30 01:14:49
If you have quoted columns inside your CSV this fails horribly anyway
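To make merijn's point concrete, here is a small sketch using `T.splitOn` as a stand-in for the regex split in the paste (an assumption; the paste uses splitRegex). The sample rows are invented:

```haskell
import qualified Data.Text as T

-- Naive field splitting: every comma is treated as a separator,
-- including commas inside quoted columns.
naiveFields :: T.Text -> [T.Text]
naiveFields = T.splitOn (T.pack ",")

plainRow, quotedRow :: T.Text
plainRow = T.pack "alice,42,ok"
quotedRow = T.pack "\"Smith, Jane\",42,ok"
```

`naiveFields plainRow` yields three fields, but `naiveFields quotedRow` yields four, because the quoted name is cut in two. A real CSV parser handles the quoting rules.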
zipper 2017-01-30 01:15:10
Insanity_: How about a better debugging method than manually reading a CSV file you guys.
merijn 2017-01-30 01:15:43
tbh, I would start with just getting a CSV parser
Insanity_ 2017-01-30 01:15:55
Well yeah.. We actually do have one.. We are writing in Java normally
zipper 2017-01-30 01:15:56
:)
Insanity_ 2017-01-30 01:16:06
I just wanted to see if I could get something quick out of Haskell to do the same thing :p
zipper 2017-01-30 01:16:30
Insanity_: Call lineToList with each file in the repl
zipper 2017-01-30 01:16:48
In the repl readFile then lineToList
zipper 2017-01-30 01:16:53
then for the second file
zipper 2017-01-30 01:17:26
Ok this is offtopic Haskell and now into general debugging
Insanity_ 2017-01-30 01:18:58
Thanks, I'll play around with it and see ;-)
Insanity_ 2017-01-30 01:24:29
*facepalm* you put me on the right track zipper. The issue was actually one of the CSV files being spit out of some of our tools not generating pretty CSV files, but the viewer handled it correctly.. Anyway, after a small fix to the files it now works
zipper 2017-01-30 01:30:00
So a CSV file not being pretty printed breaks haskell's readFile?
zipper 2017-01-30 01:30:06
Insanity_: ^
merijn 2017-01-30 01:31:07
zipper: No, it breaks his "parser"
Insanity_ 2017-01-30 01:31:42
merijn: Yeah well, the "parser" in Haskell was correct, but well, when one of the tools misses putting a "comma" in, it's a pretty CSV generator
Insanity_ 2017-01-30 01:31:47
pretty shitty*
merijn 2017-01-30 01:32:01
Ah, I suppose that could also happen :p
Insanity_ 2017-01-30 01:32:22
Yeah, it shouldn't, but it did :p
be5invis 2017-01-30 01:32:56
you know, sometimes i have strange imaginations, like: in the Cold War, Russians choose to use ternary computers with a purely functional language...
be5invis 2017-01-30 01:33:43
they are completely separated from the western world, and their mathematicians, like Kolmogorov, decided to build their own computer and the entire theory...
_deafbeef 2017-01-30 01:37:43
hello folks, windows 10 64-bit user here, using stack; I've tried following the official documentation on compiling DLLs for foreign languages, over at this page: https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/win32-dlls.html#making-dlls-to-be-called-from-other-languages
_deafbeef 2017-01-30 01:38:25
here is the result, all the files are the same as in the documentation, except for the changes that are explicitly stated in the following picture: https://i.snag.gy/zFv9wa.jpg
_deafbeef 2017-01-30 01:38:28
any help welcome!
Profpatsch 2017-01-30 01:43:28
Is there a module annotation to disable a certain kind of warning?
Profpatsch 2017-01-30 01:43:57
I'd like to disable -Wtype-defaults and -Wmissing-signatures for one module of mine
Profpatsch 2017-01-30 01:43:58
.
lyxia 2017-01-30 01:44:43
{-# OPTIONS_GHC -Wno-type-defaults #-} ?
merijn 2017-01-30 01:44:44
Profpatsch: I think you can use a pragma to indicate flags to use when compiling a module
merijn 2017-01-30 01:44:54
Profpatsch: But it's a bit...questionable
merijn 2017-01-30 01:45:06
I'd hate having to open every file to see which flags it uses
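For reference, the file-header pragma GHC recognises is spelled `OPTIONS_GHC`. A minimal sketch of a module that silences the two warnings Profpatsch mentions, assuming GHC 8.0 or later (where the `-Wno-` flag forms exist); `area` is an invented example binding:

```haskell
{-# OPTIONS_GHC -Wno-type-defaults -Wno-missing-signatures #-}

-- No type signature, and the exponent's type is defaulted:
-- both would normally be reported under -Wall.
area r = 3.14 * r ^ 2
```

The pragma only affects the module it appears in, which is also why merijn's concern applies: the flags end up scattered across files instead of living in one build configuration.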