-
@crunklord420 I am trying to do some statistics in C. I usually use Python, but want to try it in C. I piped data to gnuplot to plot it, but it is significantly slower than using Python libraries. Do you have an idea how to make it faster?
-
@Saxophone3784 I'm not sure, probably something to do with your piping. Every write carries some overhead, so generally you want to bundle your workload up and then do it as a single run to maximize throughput.
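
A minimal sketch of what "bundling" could look like on the C side, assuming the data goes out through a stdio stream (for example a pipe to gnuplot). The function name `send_values_buffered` and the 1 MiB buffer size are illustrative choices, not something from the thread. It only helps if the time is going into many small writes or per-value flushes; if gnuplot itself is the bottleneck, it will not change much.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: send n values, one per line, through a fully
   buffered stream and flush once at the end.
   Must be called before any other I/O on `out`, because setvbuf
   requires that. Returns 0 on success, -1 on failure. */
int send_values_buffered(FILE *out, const double *data, size_t n)
{
    assert(out != NULL);

    /* 1 MiB buffer: an illustrative size, not a tuned value. */
    if (setvbuf(out, NULL, _IOFBF, 1 << 20) != 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (fprintf(out, "%.17g\n", data[i]) < 0)
            return -1;
    }

    /* One flush for the whole batch instead of one per value. */
    return fflush(out) == 0 ? 0 : -1;
}
```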
-
@Saxophone3784 there has to be a better way.
-
@crunklord420 Yeah, I thought about it. Can't find a way to do it though. I just have a binary data file and I basically pipe it into gnuplot using fprintf, one number at a time. The speed penalty comes from having something like 100k data points. It is a histogram, so I have to do it one number at a time; there are no x and y pairs.
-
@Saxophone3784 multithreading when? Remember to interact with mutexes as little as possible, otherwise you can destroy your performance.
-
@crunklord420 I have been experimenting with it these past two days. I actually opted for a different approach. Gnuplot can plot histograms itself, but you can also pre-calculate the histogram in C beforehand. That way, if you have 100,000 data points, you can reduce them to however many bins you want and then feed only the bin counts to gnuplot. This seems to be a much better approach, although to be honest Python's matplotlib is still about as fast. Remember you once said you don't really know anything that Python does better? I still think it is the best tool at the moment for quick data visualization and manipulation.
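
The pre-binning idea described above can be sketched roughly as follows. This assumes equal-width bins over a caller-chosen range [lo, hi); the function names `bin_samples` and `write_bins` are hypothetical, and the thread does not say how the bins were actually computed. The output uses gnuplot's inline data form (`plot '-'` terminated by a line containing `e`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count how many samples fall into each of nbins
   equal-width bins over [lo, hi). Samples outside the range (and NaNs)
   are skipped. Returns the number of samples that were counted. */
size_t bin_samples(const double *data, size_t n, double lo, double hi,
                   size_t *counts, size_t nbins)
{
    assert(nbins > 0 && hi > lo);

    for (size_t b = 0; b < nbins; b++)
        counts[b] = 0;

    double width = (hi - lo) / (double)nbins;
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (!(data[i] >= lo && data[i] < hi))
            continue;                       /* out of range or NaN */
        size_t b = (size_t)((data[i] - lo) / width);
        if (b >= nbins)
            b = nbins - 1;                  /* guard against rounding at the top edge */
        counts[b]++;
        used++;
    }
    return used;
}

/* Hypothetical helper: emit a gnuplot plot command followed by one
   "bin_center count" line per bin as inline data. */
void write_bins(FILE *out, const size_t *counts, size_t nbins,
                double lo, double hi)
{
    double width = (hi - lo) / (double)nbins;

    fprintf(out, "plot '-' using 1:2 with boxes title 'histogram'\n");
    for (size_t b = 0; b < nbins; b++)
        fprintf(out, "%g %zu\n", lo + ((double)b + 0.5) * width, counts[b]);
    fprintf(out, "e\n");
}
```

On a POSIX system, `out` could be a pipe opened with `popen("gnuplot -persist", "w")`. With 100,000 samples and, say, 50 bins, gnuplot then only has to parse 50 lines instead of 100,000.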
-
@crunklord420 I am actually really enjoying minmaxing in C and making the programs run blazingly fast. It's a good feeling when you can really use your PC's resources efficiently.
-
@Saxophone3784 Python is a slow language and people use it because, I don't know, people tell them it's "easy". It's "glue code"; any serious Python codebase is constantly calling into C libraries.
-
@crunklord420 But objectively, plotting stuff fast is easiest in Python for me. Say you get some weird CSV data which you need to cut into pieces, inspect, and plot specific rows/columns of. You are not going to get the same type of CSV data in the future, so it doesn't make sense to write a program that cuts out the same rows/columns. What tool would you use?
-
@Saxophone3784 I'm writing some tax tools for myself where I will be importing CSVs from various sources and doing big decimal integer math. I picked Rust, using serde and stuff.
I'm sure there's some CSV library for C, but I picked Rust because correctness is more important than performance. I have experience using serde.
I cringe to think about handling numbers in Python because I just assume everything is going to decay into a double or something. If I want to write a 16-bit integer in binary to a file I have to dig into something like struct.pack, whereas it's just incredibly straightforward in C.
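
For reference, writing a 16-bit integer in binary from C might look like the sketch below. The function names are hypothetical. The first variant dumps the value in the machine's native byte order; the second fixes the order to little-endian, which matters if the file is read on another platform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write v as two raw bytes in the host's native
   byte order. Returns 0 on success, -1 on failure. */
int write_u16_native(FILE *f, uint16_t v)
{
    assert(f != NULL);
    return fwrite(&v, sizeof v, 1, f) == 1 ? 0 : -1;
}

/* Hypothetical helper: write v as two bytes, least significant byte
   first, regardless of host endianness. Returns 0 on success, -1 on
   failure. */
int write_u16_le(FILE *f, uint16_t v)
{
    assert(f != NULL);
    unsigned char b[2] = {
        (unsigned char)(v & 0xFF),  /* low byte */
        (unsigned char)(v >> 8)     /* high byte */
    };
    return fwrite(b, 1, 2, f) == 2 ? 0 : -1;
}
```

The file should be opened in binary mode (for example `fopen(path, "wb")`) so that no newline translation is applied on platforms that do that.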
-
@Saxophone3784 I really should have said "the flexibility of C" instead of "performance". You can write fast Rust.
-
@Saxophone3784 If you need a REPL, then Python has a REPL. I have at times used Python to interact with binary file formats instead of writing a compiled program to do it.
Don't worry, most people who do this type of graphing stuff probably use Python for the reasons you describe. On the other hand, the majority of software written by data scientists is slow and extremely fragile due to dependency management.
-
@crunklord420 Idk, for my tax stuff I always just use plain LibreOffice, but it depends on how much manipulation you do. I do very little, so I don't need anything more for that.
My use of Python currently is really for inspecting lots of measurement datasets and experimenting with them: trying to plot various columns against each other, changing stuff, etc. I honestly don't see that being quick to do in Rust. But maybe I am just too used to Python.
I would say I see Python at the moment as a rough prototyping tool, and once you figure out what exactly you want, you can code it in a decent language.
-
@Saxophone3784 My taxes are complicated because of capital gains/losses and currency conversion. I have to normalize data across multiple brokerages, etc.