Unpacking Elixir: Concurrency (underjord.io)
329 points by lawik on Aug 25, 2023 | 131 comments


I think a few code examples would have been nice.

The thing I like about concurrency in Elixir is that it's there if you need it but it's mostly not mandatory in your code, compared to JavaScript where it's kind of imposed on you even when it's a hindrance.

I remember one trick I used to do with LiveView on click events etc was to put all async (or "asyncable") code in a spawn function, which would speed up the return.
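A minimal sketch of that trick in plain Elixir, without Phoenix. `ClickHandler.handle_click/2` is a hypothetical stand-in for a LiveView `handle_event/3` callback, and the sleeping function stands in for whatever slow work you want out of the way:

```elixir
defmodule ClickHandler do
  # Hypothetical stand-in for a LiveView handle_event/3 callback;
  # `state` plays the role of the socket.
  def handle_click(state, slow_fun) do
    # Task.start/1 runs the function in a new, unlinked process and
    # returns immediately, so the handler does not wait for it.
    {:ok, _pid} = Task.start(slow_fun)
    {:noreply, Map.update(state, :clicks, 1, &(&1 + 1))}
  end
end

parent = self()

slow = fn ->
  # Pretend this is an email send or an API call.
  Process.sleep(200)
  send(parent, :slow_work_done)
end

{micros, {:noreply, state}} = :timer.tc(fn -> ClickHandler.handle_click(%{}, slow) end)
IO.puts("handler returned after #{div(micros, 1000)} ms with #{inspect(state)}")

receive do
  :slow_work_done -> IO.puts("background work finished later")
after
  1_000 -> IO.puts("background work did not finish in time")
end
```

The trade-off is that nobody is watching the spawned process: if it crashes, the handler never finds out, so this only suits work whose result the click does not depend on.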


Yes exactly! This is super important for people new to Elixir. The vast majority of the time, you don't even need to know that there is any concurrency. You can write entire sophisticated Phoenix apps and never need it. But when you do eventually need it, it's there and it's wonderful, once you grok the pattern.


Yes, I know this is mostly a +1 comment, but this is a huge differentiator from other systems I have worked in where "oh I need concurrency..." becomes a question of what libs, styles, where are the gremlins in the library, where are the state gremlins in our code, etc etc. Vs with Elixir you just sort of go "actually I need two of these. actually I need 200 of these. actually..." and it "just works".

Not that there isn't any stuff to learn there: you have to understand how actors pass messages, how you can unintentionally bottleneck via call (caller waits for a reply) vs cast (caller keeps going), etc. But it's very surprise-free because, surprise surprise, Erlang has been built from the ground up for concurrency and generally worked out how it should work 30 years ago instead of bolting some keywords on.
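To make the call vs cast difference concrete, here is a small sketch. `SlowServer` and its 50 ms sleep are invented for illustration; only `GenServer.call/2` and `GenServer.cast/2` are the real API:

```elixir
defmodule SlowServer do
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, 0)

  @impl true
  def init(count), do: {:ok, count}

  # call: the caller blocks until this clause replies.
  @impl true
  def handle_call(:work, _from, count) do
    Process.sleep(50)
    {:reply, count + 1, count + 1}
  end

  def handle_call(:count, _from, count), do: {:reply, count, count}

  # cast: the caller has already moved on by the time this runs.
  @impl true
  def handle_cast(:work, count) do
    Process.sleep(50)
    {:noreply, count + 1}
  end
end

{:ok, pid} = SlowServer.start_link()

{call_us, 1} = :timer.tc(fn -> GenServer.call(pid, :work) end)
{cast_us, :ok} = :timer.tc(fn -> GenServer.cast(pid, :work) end)
IO.puts("call took ~#{div(call_us, 1000)} ms, cast took ~#{div(cast_us, 1000)} ms")

# The mailbox is processed in order, so this call queues behind the cast.
2 = GenServer.call(pid, :count)
```

The last line is where the unintentional bottleneck hides: a cheap call can still be slow if it is waiting behind expensive messages in the same process's mailbox.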


Agree. That's a big advantage of thread-based concurrency [0] vs async/await. The decision of whether to do things sequentially or concurrently sits with the caller, not the callee. Which means it's there where you need it and out of the way when you don't. With async/await, the decision is made by a function implementer somewhere down the stack, quite probably in 3rd party code. As the caller you get no choice because coloured functions [1]. You have to deal with concurrency whether your app needs it or not.

--

[0] By thread I mean the general concept, not specifically OS threads. In this case it includes the fine-grained processes offered by the BEAM VM that underpins Elixir and Erlang.

[1] https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...
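A small sketch of "the caller decides", assuming a made-up `Pricing.quote_for/1` with a sleep standing in for a slow lookup. The callee is an ordinary function; one caller runs it sequentially and another runs it concurrently without the callee changing:

```elixir
defmodule Pricing do
  # An ordinary function: nothing in its definition says "async".
  def quote_for(item) do
    # Stand-in for a slow lookup.
    Process.sleep(100)
    {item, String.length(item) * 10}
  end
end

items = ["apple", "kiwi", "melon"]

# Caller A wants it sequential.
{seq_us, sequential} = :timer.tc(fn -> Enum.map(items, &Pricing.quote_for/1) end)

# Caller B wants it concurrent; the callee is unchanged.
{conc_us, concurrent} =
  :timer.tc(fn ->
    items
    |> Enum.map(fn item -> Task.async(fn -> Pricing.quote_for(item) end) end)
    |> Task.await_many()
  end)

true = sequential == concurrent
IO.puts("sequential ~#{div(seq_us, 1000)} ms, concurrent ~#{div(conc_us, 1000)} ms")
```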


Thread-based concurrency is a big part of it, but another nice thing is that mostly synchronous APIs in Erlang are built upon an asynchronous send, followed by an immediate wait. If/when you decide that you don't want the synchronous API, you can usually relatively easily separate the two parts, and process them differently, without having to spawn more threads to do each thing synchronously and then join the threads the way you want.
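Here is a rough sketch of that "send, then immediately wait" structure using only `send/2` and `receive`. The `Echo` module and its message shapes are invented for the example; real OTP calls do more, such as monitoring the server:

```elixir
defmodule Echo do
  # A tiny server process that replies to {:request, from, ref, payload}.
  def start, do: spawn(fn -> loop() end)

  defp loop do
    receive do
      {:request, from, ref, payload} ->
        send(from, {:reply, ref, String.upcase(payload)})
        loop()
    end
  end

  # Part 1: the asynchronous send. Returns a unique reference immediately.
  def request(server, payload) do
    ref = make_ref()
    send(server, {:request, self(), ref, payload})
    ref
  end

  # Part 2: the wait for the reply carrying that reference.
  def await(ref, timeout \\ 1_000) do
    receive do
      {:reply, ^ref, result} -> {:ok, result}
    after
      timeout -> {:error, :timeout}
    end
  end

  # The "synchronous" API is just the two parts back to back.
  def call(server, payload), do: server |> request(payload) |> await()
end

server = Echo.start()
{:ok, "HELLO"} = Echo.call(server, "hello")

# Separating the parts: fire all requests first, then collect the replies.
refs = Enum.map(["a", "b", "c"], &Echo.request(server, &1))
IO.inspect(Enum.map(refs, &Echo.await/1))
```

No extra processes were needed to overlap the three requests; the caller simply postponed the waiting.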

Building on asynchronous sends also makes it conceptually the same to make a local request vs a request on a remote node, although with some additional failure conditions that don't usually make sense to worry about on a single node system.
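As an illustration of local and remote requests sharing one shape, the sketch below addresses a named GenServer as `{name, node}`. `Greeter` and `GreeterClient` are hypothetical, and the example only exercises the local node; with a connected cluster you would pass another node's name as `target_node`:

```elixir
defmodule Greeter do
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(state), do: {:ok, state}

  @impl true
  def handle_call({:greet, name}, _from, state) do
    {:reply, "hello #{name} from #{node()}", state}
  end
end

defmodule GreeterClient do
  # `target_node` defaults to the local node. Passing the name of another
  # connected node makes the same request remotely.
  def greet(name, target_node \\ node()) do
    try do
      {:ok, GenServer.call({Greeter, target_node}, {:greet, name}, 2_000)}
    catch
      # The extra failure conditions (node down, no such process, timeout)
      # all surface as exits from GenServer.call/3.
      :exit, reason -> {:error, reason}
    end
  end
end

{:ok, _pid} = Greeter.start_link()
IO.inspect(GreeterClient.greet("ann"))
```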


Is thread-based the right terminology for Erlang's lightweight processes? I guess the pattern is comparable.


I remember building a LiveView dashboard that eventually had A LOT of things in a table being updated in real time, to the point that the browser tab used something like 15% of CPU. I implemented batching of websocket updates and added a sleep() in an intermediate GenServer loop in about an hour... Problem solved. I could just change the sleep value depending on how fast I wanted the updates to happen.
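One possible shape for such a batching process, as a sketch only. It swaps the `sleep()` for `Process.send_after/3`, which does the same pacing without blocking the GenServer; `Batcher`, `flush_fun` and `interval_ms` are made-up names, not LiveView APIs:

```elixir
defmodule Batcher do
  use GenServer

  # `flush_fun` receives the list of updates collected during one interval.
  def start_link(flush_fun, interval_ms) do
    GenServer.start_link(__MODULE__, {flush_fun, interval_ms})
  end

  def push(pid, update), do: GenServer.cast(pid, {:push, update})

  @impl true
  def init({flush_fun, interval_ms}) do
    schedule(interval_ms)
    {:ok, %{flush_fun: flush_fun, interval_ms: interval_ms, pending: []}}
  end

  @impl true
  def handle_cast({:push, update}, state) do
    {:noreply, %{state | pending: [update | state.pending]}}
  end

  @impl true
  def handle_info(:flush, state) do
    # Send everything collected so far as one batch, oldest first.
    if state.pending != [], do: state.flush_fun.(Enum.reverse(state.pending))
    schedule(state.interval_ms)
    {:noreply, %{state | pending: []}}
  end

  defp schedule(interval_ms), do: Process.send_after(self(), :flush, interval_ms)
end

parent = self()
{:ok, batcher} = Batcher.start_link(fn batch -> send(parent, {:batch, batch}) end, 100)
for i <- 1..5, do: Batcher.push(batcher, {:row_updated, i})

receive do
  {:batch, batch} -> IO.puts("one message carrying #{length(batch)} updates")
after
  1_000 -> IO.puts("no batch arrived")
end
```

Changing `interval_ms` plays the role of changing the sleep value.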


Unlike JavaScript, Elixir makes you write synchronous code, e.g.

    def process_items(items) do
      items
      |> Task.async_stream(&Processor.process/1, max_concurrency: 2, timeout: 7000, on_timeout: :kill_task)
      |> Enum.to_list()
    end

The semantics of regular and concurrent code are the same.


Node has kind of wrecked the term "synchronous". It convinced many people that "synchronous" also entails "only thing running on the processor at the time" and "if you're 'synchronously' waiting on a file read the entire process is waiting". So when you say "Elixir/Erlang/any threaded language 'makes' you write synchronous code", to the people who most need to hear this, those steeped in the async way of the world, you confuse them because to them that means your language can only do one thing at a time.

I'm not saying I'm happy about this or that the meaning has "really" changed; I'm just saying from experience you don't really want to phrase it this way because you only reach those who already know.


That's so sad, because any truly parallel workload runs synchronously, on multiple processors, not asynchronously (i.e. on as little as a single processor).


The comment here is conveying that there is no function color, i.e. the language does not distinguish standard synchronous calls from asynchronous calls.


Very convenient to parallelise HTTP queries (and without the need to go “evented”).

One recent example where I assert that API responses match our OpenAPI specifications here, for the curious:

https://github.com/etalab/transport-site/pull/3351/files#dif...


This. I don't want to be in an event loop by default.


Hate to break it to you but in the BEAM you're always in an event loop.


Not really. The scheduler isn’t really like an event loop. The scheduler will only allow a task so many units of execution before pulling it off the processing queue and scheduling other work. You can block an event loop, but not the scheduler.
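A quick way to see this for yourself, as a sketch: start more CPU-bound infinite loops than there are scheduler threads and check that an unrelated process still gets to run. `Spinner` is a made-up module:

```elixir
defmodule Spinner do
  # An endless CPU-bound loop that never yields voluntarily.
  def spin(n \\ 0), do: spin(n + 1)
end

# Start more busy loops than there are scheduler threads.
spinners =
  for _ <- 1..(System.schedulers_online() * 2) do
    spawn(fn -> Spinner.spin() end)
  end

# Another process still gets CPU time and answers.
{micros, :pong} = :timer.tc(fn -> Task.await(Task.async(fn -> :pong end)) end)
IO.puts("got a reply after #{micros} microseconds despite #{length(spinners)} busy loops")

# Clean up the busy loops.
Enum.each(spinners, &Process.exit(&1, :kill))
```

The equivalent `while (true) {}` in a Node callback would stop everything else on that event loop.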


Adding preemption doesn't make it not an event loop, IMHO.


The context of the discussion was the node event loop, which doesn’t have preemption. The BEAM scheduler is only like an event loop when you’re locked to one physical processor core, otherwise the scheduler is spreading work across multiple OS level threads. You could squint and call this an event loop manager with preemption, but I think the phrase “event loop” detracts from the clarity of any correct description.


At the limit, there is always an event loop in the kernel. The question is how leaky are the abstractions on top.


you can replicate this using promises

source: nodejs dev for 8 years, now fulltime elixir dev for the last 5


Disclaimer: I have never used Elixir in any serious capacity, but I have done a good chunk of Erlang.

Concurrency in Erlang sort of frustrates me...not because it's bad, but because when I use it I start getting pissed at how annoying concurrency is in nearly every other language. So much of distributed systems tooling in 2023 is basically just there to port over Erlang constructs to more mainstream languages.

Obviously you can get a working distributed system by gluing together caches and queues and busses and RPC, but Erlang gives you all that out of the box, and it all works.


    Any sufficiently complicated distributed program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Erlang


I've done this multiple times. I will not make the mistake again. Next big system is being born in Elixir.


Well, give me a call if you need help. Just looking for work again lately.


Feel free to shoot me your cv/resume to the address in my profile! We just hired two new engineers this week, so the timing is not perfect, but I would like to have your info around for a rainy day which might come sooner than later.


Just did. Thank you.


Same as @pdimitar. I’m looking for part-time work.


please see comment to @pdimitar and do the same ;)


That has definitely been my experience. There's been a good number of times I end up gluing things together with ZeroMQ and SQS and Redis where I start thinking "you know this would have been easier to get an equivalent (or better) product in Erlang".

Sadly, I've only ever had one job at a shady startup that used Erlang; I would love to work for a company that sees its power, but sadly it seems that everyone who has not drunk the Erlang Kool-Aid thinks that Go is a suitable replacement.


But Go only has 3 or 4 ways you can shoot yourself in the foot at every turn with concurrency! Better than C++, I guess.


I don't really mean to crap on Go, I actually think it's a reasonably ok language all things considered; if nothing else (at least on one machine), it gives you close to the highest performance-to-effort ratio around.

It just annoys me that people see "Go is good at concurrency", and then see "Erlang is good at concurrency", and then assume Go ~= Erlang, which is not the correct conclusion.



This is coming from someone who likes Elixir, not so much for its distributed systems features, but mostly because of the language design. I keep hearing everyone talk about how Erlang/Elixir gives everything out of the box and you don't need to worry about Queues, RPC or whatever... But in reality, people don't really recommend using Distributed Erlang that much. On most Elixir gigs I worked, they didn't use Distributed Elixir at all, just plain old Kubernetes, Kafka, Redis and GRPC.


> I keep hearing everyone talk about how Erlang/Elixir gives everything out of the box and you don't need to worry about Queues, RPC or whatever...

Many companies are using Distributed Erlang but not the way you described: they are using it to exchange messages between nodes running the same version of the software. Imagine that you are building a web application, you can directly exchange live messages between nodes without Redis/RabbitMQ [1]. If you are deploying a machine learning model, you can route requests through multiple nodes and GPUs without additional deps [2]. If you want to see which users are connected to your app, you can exchange this information directly between nodes [3].

In other words, there is a subset of distributed problems that Distributed Erlang solves very well out of the box: homogeneous systems working on ephemeral data. And some of the scenarios above are very common.
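To give a feel for the "exchange live messages between nodes" case without pulling in Phoenix.PubSub, here is a sketch that swaps in OTP's built-in `:pg` process groups (OTP 23+), whose membership is shared across connected nodes. The `Room` module and the `:demo_room` group are invented, and this run only covers a single node:

```elixir
defmodule Room do
  @group :demo_room

  # The calling process becomes a member of the group.
  def join, do: :pg.join(@group, self())

  # Sends the message to every member, on whichever node it lives.
  # Returns how many members were addressed.
  def broadcast(message) do
    members = :pg.get_members(@group)
    Enum.each(members, &send(&1, {:room_message, message}))
    length(members)
  end
end

# Start the default :pg scope; in an application this would sit in a supervision tree.
{:ok, _pg} = :pg.start_link()
:ok = Room.join()
1 = Room.broadcast("hello")

receive do
  {:room_message, msg} -> IO.puts("received: #{msg}")
after
  1_000 -> IO.puts("nothing received")
end
```

The data here is ephemeral and every node runs the same code, which is exactly the subset described above.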

If you need to communicate between different systems or you need to persist data, then I 100% agree with you, I would not use Distributed Erlang (at best it would be one part of the solution). I think this distinction gets lost in many conversations and sometimes it leads to false dichotomies: "why Erlang when I have k8s/grpc/redis?" while in practice there is not a lot of overlap. I have written about Erlang/Elixir vs Redis [4] and vs k8s [5] for those interested.

[1]: https://hexdocs.pm/phoenix_pubsub/Phoenix.PubSub.html [2]: https://news.livebook.dev/distributed2-machine-learning-note... [3]: https://hexdocs.pm/phoenix/Phoenix.Presence.html

[4]: https://dashbit.co/blog/you-may-not-need-redis-with-elixir [5]: https://dashbit.co/blog/kubernetes-and-the-erlang-vm-orchest...


> In other words, there is a subset of distributed problems that Distributed Erlang solves very well out of the box: homogeneous systems working on ephemeral data. And some of the scenarios above are very common.

Speaking of which, I'm looking forward to using Broadway [1] in a new project here in my company. Here, people are using an enterprise integration engine specialized in the healthcare space [2], with built-in single-branch version control and all actions going through the UI.

As I come from a background of several years with Ruby on Rails, I really hope to convince people to use this great library/framework your company created, since RoR is severely lacking when handling heavy concurrency like when gluing multiple APIs in complex workflows. Software engineers are going to love it, but integration analysts are used to IDEs with GUIs, so we'll need to create a pretty admin dashboard to convince them to switch.

[1] https://elixir-broadway.org/ [2] https://rhapsody.health/solutions/rhapsody/


We use it across multiple versions of our software running in the same cluster. As long as you dark launch API changes, it's not much of an issue.

https://www.youtube.com/watch?v=pQ0CvjAJXz4

(Doh, just realized I was replying to Mr. Elixir himself! And you're familiar with our project anyway :)


Well, in big orgs that adopt Elixir/Erlang along with other technologies with poor concurrency stories, those other ecosystems still need Kubernetes, Kafka, Redis and GRPC, to get by. Elixir isn't going to make Ruby or Python apps magically concurrent. So that makes sense.

However, in orgs that are primarily Elixir shops, I don't see a lot of Kafka or gRPC. (Redis is just useful, it's more than just a queue, and K8s and Elixir/Erlang complement each other, btw.)


>those other ecosystems still need Kubernetes, Kafka, Redis and GRPC, to get by

And what makes Elixir not need Kafka, Redis or GRPC?

Instead of Redis, you could use ETS for caching. But once you have 2+ instances of your app, you will need to have a centralized caching mechanism, otherwise, each instance will have its own ETS with its own memory, not sharing anything. Unless you decide to use Distributed Erlang and connect all the nodes of your application, which comes with a lot of trouble. Much easier to just use Redis.
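For reference, a per-node ETS cache is only a few lines. This is a sketch with made-up names (`LocalCache`, `:local_cache_demo`), and it shows the limitation described above: the table lives in one VM's memory and is not shared with other instances:

```elixir
defmodule LocalCache do
  # Hypothetical table name.
  @table :local_cache_demo

  # The table is owned by the process that calls init/0 and disappears with it,
  # so in a real app a long-lived process should own it.
  def init do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:set, :public, :named_table, read_concurrency: true])
    end

    :ok
  end

  def put(key, value) do
    true = :ets.insert(@table, {key, value})
    :ok
  end

  def get(key) do
    case :ets.lookup(@table, key) do
      [{^key, value}] -> {:ok, value}
      [] -> :miss
    end
  end

  # Returns the cached value, computing and storing it on a miss.
  def fetch(key, compute_fun) do
    case get(key) do
      {:ok, value} ->
        value

      :miss ->
        value = compute_fun.()
        put(key, value)
        value
    end
  end
end

LocalCache.init()
IO.inspect(LocalCache.fetch(:answer, fn -> 6 * 7 end))
IO.inspect(LocalCache.get(:answer))
```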

And let's say you have multiple teams, each team has its own service (built with Elixir), and you need to have some async communication between those services. What native Elixir solution would you use instead of Kafka?

Same for GRPC. What's the alternative? Connecting all the nodes and using Erlang message-passing?


I think Elixir/Erlang + Redis pub-sub + PostgreSQL is the sane minimal subset for distributed and scalable systems.

Just say no to Kafka.


Where Kafka comes in though is no-code db following. This is super handy when you want to be informed of changes to a table, but don't know which "micro-service" is changing the data. Kafka Connect is very handy.

Though I will concede, it's a bit of a bazooka for a mosquito, sort of thing.


After using NATS I don't think I'll ever want to use redis pubsub again.


ETS is quite a bit faster than using Redis though.


single node though, so you have to add distribution in some manner. For some situations, whether it's the "right" way or not, redis ends up being an easier way to go.

ETA: As another couple of comments pointed out, ETS also dies with the node so you've got to handle that also when rolling it.

ETS is cool, but it's not a panacea.


Nebulex has different adapters, although I've only used it on a local node, which uses ETS, and with transient data, so I can't comment on them too much.

https://github.com/cabol/nebulex


> Connecting all the nodes and using Erlang message-passing?

Most of the time, yes.


well my startup uses distributed elixir. we use horde to distribute certain long-lived processes across our cluster. that is not mutually exclusive with kubernetes: we use kubernetes to manage and handle vm crashes (it's only happened twice ever) as well as provide a consistent network topology between the nodes.

that said, having the ability to send messages between machines without having to bring in an external dependency like redis is AWESOME from a maintenance perspective. one less thing to worry about and the system just works. our uptime has been pretty amazing given our small team.


Could you explain a bit more how you are using it?


our payment processor requires a persistent connection to their server for each of our clients. we use horde to make sure there's only one process responsible for each paying user across our cluster.
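The single-node version of "only one process per user" can be sketched with Elixir's built-in `Registry`; Horde's job is to provide that guarantee across a cluster. `PaymentConnection` and `PaymentRegistry` are hypothetical names, and a real system would start these processes under a supervisor:

```elixir
defmodule PaymentConnection do
  use GenServer

  # Hypothetical registry name.
  @registry PaymentRegistry

  def start_link(user_id) do
    GenServer.start_link(__MODULE__, user_id, name: via(user_id))
  end

  # Returns the existing process for the user, or starts one.
  def ensure_started(user_id) do
    case start_link(user_id) do
      {:ok, pid} -> {:ok, pid}
      {:error, {:already_started, pid}} -> {:ok, pid}
    end
  end

  # Unique key per user: the registry rejects a second registration.
  defp via(user_id), do: {:via, Registry, {@registry, user_id}}

  @impl true
  def init(user_id), do: {:ok, %{user_id: user_id}}
end

{:ok, _} = Registry.start_link(keys: :unique, name: PaymentRegistry)
{:ok, pid1} = PaymentConnection.ensure_started("user-42")
{:ok, pid2} = PaymentConnection.ensure_started("user-42")
IO.puts("same process for the same user? #{pid1 == pid2}")
```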


Elixir/Erlang often simply replaces most of what Kafka, Redis and GRPC offer.

Also, have a look at the Phoenix Framework Channels examples, as it essentially replaces most simple micro-services architectures.

This recipe book covers the common design examples:

"Designing Elixir Systems with OTP: Write Highly Scalable, Self-Healing Software with Layers" (James Edward Gray, II, Bruce A. Tate)

One day, you too may get annoyed with IT complexity, and articles mostly written by LLM chatbots.

Happy computing, =)


Sasa Juric makes this point in 'Elixir In Action' and some of his talks, where in other languages you need to pull in these other technologies near the start, whereas in Elixir you can use the built-in primitives until you start running into scaling problems.

- https://youtu.be/JvBT4XBdoUE?si=Xo0QXgVSI2HCg8pj&t=2198


"just plain old kubernetes" is an oxymoron, at best.


> they didn't use Distributed Elixir at all, just plain old Kubernetes, Kafka, Redis and GRPC

There must be a good rationale for that decision. Do you know what it is?


There's a good article about BEAM + k8s by José Valim [0]

[0] https://dashbit.co/blog/kubernetes-and-the-erlang-vm-orchest...

You could certainly get away without some of the other stuff but, as another comment has mentioned, it requires some infra know-how. Like, you can't "just" use ETS as a Redis replacement without setting it up in a way that its data won't get blown away during a blue-green deploy.


There may or may not be “good” rationale. Could just be that most people using elixir are coming from other languages/ecosystems where all of that is normal.

Also in my experience, most of the time, the infrastructure team doesn’t know anything about elixir.


Cargo culting happens regardless of language. It's true that fully meshed distribution won't fly for Google or Amazon scale. But 99% of companies will never get to that scale, despite what they'd like to believe. Fully meshed distribution works just fine for many use cases.


> It's true that fully meshed distribution won't fly for Google or Amazon scale.

I'm not sure I agree at product level: WhatsApp seems to be scaling rather well. I can't say if their use of Erlang is "fully meshed distribution" or not, but it seems to be flying just fine as the world's number 1 messaging platform.


I fell in love with Erlang pretty quickly, and it’s hard for me to enjoy writing other languages. Simple concise syntax, pattern matching, immutability, error handling without branches all over the place, concurrency… it’s hard to walk away from that.


Have you tried Elixir? So many blogs etc. talk about coming to Elixir from Ruby, which I don't understand, little similarity beyond def...do...end.

Elixir uses the same VM with a different, some including myself would say better syntax. Do you still prefer Erlang syntax?


I dislike Elixir’s syntax for two reasons: it seems unnecessarily verbose, and it’s too much like Ruby/Python.

When I see Erlang syntax, it helps me think in Erlang.

Anyway, I haven’t given Elixir a fair chance, but it’s hard to get past my knee-jerk dislike of all the extra verbiage. I’d rather use LFE if I wanted an alternative syntax with more powerful macros.


regarding concurrency, language plays an important role and pretty much dictates how code is written, which i believe is where most of the frustration comes from. in erlang it’s in a functional style, in javascript it’s in an asynchronous style. what i’ve come to realize is that it’s still better and more maintainable to think synchronously and have the core of the tech handle concurrency, for example golang with its goroutines or multi-process in ruby/python. admittedly it’s not as concurrent/distributed as erlang but should be easier to work with.


that's the thing though: in elixir you can write it in a synchronous style, and making it concurrent is usually very easy because the semantics of regular and concurrent code are essentially the same.

for example refactoring something like this:

    File.stream!("path/to/some/file")
    |> Stream.flat_map(&String.split(&1, " "))
    |> Enum.reduce(%{}, fn word, acc ->
      Map.update(acc, word, 1, & &1 + 1)
    end)
    |> Enum.to_list()
to be async and concurrent (Flow uses GenServers under the hood; a GenServer is a process like any other Elixir process and can be used to keep state, execute code asynchronously and so on):

    File.stream!("path/to/some/file")
    |> Flow.from_enumerable()
    |> Flow.flat_map(&String.split(&1, " "))
    |> Flow.partition()
    |> Flow.reduce(fn -> %{} end, fn word, acc ->
      Map.update(acc, word, 1, & &1 + 1)
    end)
    |> Enum.to_list()
The point is the code isn't radically different and is easy to understand if you understand the first block of code.


Yeah, as a primarily Python guy, I find concurrency much more palatable in Elixir than in JS. As a relatively infrequent JS user, I have tripped on its asynchronicity for decades. After learning in Perl, shell scripting, and a smidge of C in the early 2000s, then PHP and Python in the subsequent decade, I rage quit JS every time I picked it up for anything significant until like 10 years ago. For the first few years, I always forgot basic facets of the language like the scope of "this" in anonymous callback functions.

While it's not a close analog, the way Unreal Engine's node-based "no code" Blueprints language approaches asynchronicity just feels so much more natural. Even hopeful pre-hello-world coders seem to conceptually get "well I can't get that because this hasn't happened yet." Having graphical representations of things both in nodes and in-game obviously helps in ways that wouldn't make sense in js, but being built from the ground up to handle it does show that it can be approached more intuitively.


Node's "experimental" stream api also leads to code with similar semantics. It gives some additional concurrency options, but of course no parallelism outside of waits.

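    import { createReadStream } from "node:fs";

    // Set an encoding so chunks are strings; words split across chunk boundaries are not rejoined.
    const counts = await createReadStream("file.txt", { encoding: "utf8" })
      .flatMap((chunk) => chunk.split(" "))
      .reduce((acc, word) => {
        acc[word] = (acc[word] ?? 0) + 1;
        return acc;
      }, {});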


Can you give some examples?


If you want to dig into some code:

The Elixir 'Getting started'[0] guide has you building a concurrent, distributed KV store using nothing but the basics of OTP (effectively the std of BEAM).

0. https://elixir-lang.org/getting-started/introduction.html


I liked this approachable talk I watched a few years ago. I wish developing in Elixir was as easy as Python.

https://youtu.be/xoNRtWl4fZU


I've found Elixir to be easy to understand given what it's doing, and it's doing a lot more work than Python does for each application. The thing is, though, that there are a lot of declarative frameworks in Python that make applications go further without burdening a developer with low-level detail. Elixir does the same thing. Elixir and OTP abstract away a lot of complexity, at times too much. Import enough frameworks into your Python application and you'll be making a lot more decisions. Thinking about process management for every application you write (BEAM process, not OS process) pushes you up front to think about the paradigm that is used everywhere, and once you grasp what you're considering you can apply your understanding of concurrency when using other languages. This often happens when an Erlang/Elixir developer works with Rust: many programmers adopt message-passing rather than sharing state, and as they make their system more robust think about how to control threads/tasks, adopting OTP-like paradigms.


Coming from (a lot of) Ruby, which is quite close to Python IMO, initially I felt Elixir was harder, but it was just because of my habits, and the huge "comfort vendor lock-in" that both Ruby & Python provide.

Ultimately I am still as fluent with Ruby as before, but I find developing with Elixir is easier than Ruby (it took a few years to get there though).


What do you find more difficult?

The achieving tasks in it or the getting paid to do so part?


There’s definitely a bit of a learning curve if you’ve never written in a language where everything is immutable. Certain problems have to be approached in a fundamentally different way if you can’t mutate state.

Fortunately there’s Agent/OTP, but again, that is pretty different from other languages (or at least the ones popular with beginners: JS/Python/Java).


You really shouldn't be using agent for mutable state. Kind of annoying that the docs push you to that. Agent is most useful when you need to tie the transient lifetime of some state with something else (think: compilers). Otherwise, it's generally better to use a database, (or ets, if you need performance), GenServer/gen_statem if you need reactive events attached to the state. Agent is just a wrapper over GenServer so it is strictly worse in terms of performance, and honestly, slightly hard to grok as to where actions take place (and thus, who is responsible for failures)
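For comparison, here is the same counter written both ways, as a sketch. `Counter` is a made-up module. With the Agent, the anonymous function is shipped to the agent process and runs there, which is part of what makes it harder to see where things happen; with the GenServer, the logic lives in named callbacks:

```elixir
# Agent version: the update function runs inside the agent process.
{:ok, agent} = Agent.start_link(fn -> 0 end)
:ok = Agent.update(agent, &(&1 + 1))
1 = Agent.get(agent, & &1)

defmodule Counter do
  use GenServer

  # Client API, runs in the caller.
  def start_link(initial \\ 0), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid), do: GenServer.call(pid, :increment)
  def value(pid), do: GenServer.call(pid, :value)

  # Server callbacks, run in the counter process.
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:increment, _from, n), do: {:reply, n + 1, n + 1}
  def handle_call(:value, _from, n), do: {:reply, n, n}
end

{:ok, counter} = Counter.start_link()
1 = Counter.increment(counter)
IO.puts("counter is now #{Counter.value(counter)}")
```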


I love elixir and am very comfortable with functional programming and generally prefer it. But being told you have to write a state machine instead of just += a var is an excellent experience in believing elixir is harder than python.


Of course your example is hyperbole and I certainly understand what you are saying. But the attitude that many programmers take when learning a new language, that something is "harder" when it's just different from what they are used to, bothers me (to be clear, I'm not directing that at you).

But comparing:

    i = 0
    for x in [1, 2, 3, 4]:
        i += x
vs

    i = Enum.reduce([1, 2, 3, 4], fn x, i -> i + x end)
There is nothing inherently "harder" about the functional version, it's just different (and of course comes with its own benefits).


I find that pure functional loops with reduce or fold often get a lot harder to read/write as soon as you have more than one variable to keep track of. Imperative loops don't really have this problem.


For me it's the other way around. With reduce you have to explicitly declare what you're keeping between each iteration.

When you have an imperative loop the state of any of the variables you've declared is up in the air. It gets even worse if you've shadowed variable names.
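To illustrate both sides of this with more than one variable, here is a sketch of a reduce that tracks a sum, a maximum and a count. `Stats.summarize/1` is invented for the example. Everything carried between iterations has to appear in the accumulator, which is the explicitness argued for here and also the extra ceremony the parent comment dislikes:

```elixir
defmodule Stats do
  # Everything that survives from one iteration to the next lives in `acc`.
  def summarize(numbers) do
    Enum.reduce(numbers, %{sum: 0, max: nil, count: 0}, fn x, acc ->
      %{
        sum: acc.sum + x,
        max: if(acc.max == nil or x > acc.max, do: x, else: acc.max),
        count: acc.count + 1
      }
    end)
  end
end

IO.inspect(Stats.summarize([3, 9, 4, 12, 7]))
```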


Especially when

  Enum.sum([1, 2, 3, 4])

exists ;)


...and Python has

    sum([1, 2, 3, 4])
so obviously not my point :P

;)


In reality if you’re just trying to += a variable, you’d use reduce (if you couldn’t use one of the built in functions).


If faced with the one-off problem shown in the video, I would have reached for a Python solution like joblib etc. But knowing a more efficient way to do it would be an additional tool. As shown in the video, it took him jumping through multiple hoops and expert help to get to his solution, without knowing beforehand whether all the required pieces could be put together in the Elixir ecosystem. There is no such hurdle with the Python solution.


Elixir with static typing would be an absolute dream. I know there's https://gleam.run/


There are plans to introduce a type system into Elixir:

https://elixir-lang.org/blog/2023/06/22/type-system-updates-...


It's not possible, because functions, pattern matching, guards, typespecs and success typing are fundamentally incommensurable: each specifies and/or enforces a subset of the other, they are all useful in their own way, but not isomorphic.

Formal attempts have repeatedly failed, even when they exclude message passing, which is the most difficult (and most useful) aspect. If Wadler cannot do it, I'm sure I (or you) cannot.

A typesafe concurrent language is possible, but only a new language layer on top of E/E and the BEAM.


Are you aware of any of the recent advances?

There is the set-theoretic types work [1] led by José and a couple of PhDs, and also eqwalizer by WhatsApp [2]

[1] https://elixir-lang.org/blog/2022/10/05/my-future-with-elixi... [2] https://github.com/WhatsApp/eqwalizer


In addition to Gleam, there are also ongoing attempts to add static typing to Elixir.


Are you saying Gleam has failed?


I like this Underjord guy. I've watched some of his YouTube content on Elixir. He's good at explaining concepts, has a natural teaching ability.


Made my day


Speaking of Elixir, I wrote a simple guide on getting an Elixir web app set up in podman (safer Docker). Also it uses Plug to demonstrate how the framework Phoenix is built.

https://spacedimp.com/blog/dockerless-setting-up-an-elixir-w...


Thanks for the article. I haven't written much Erlang.

I am looking for a better representation of concurrency, and in my thinking I've found that what feels easiest to understand is a timeline grid with rows for independent processes and columns for time, with demarcations for events. You could say a sequence diagram is closest, but I'm also thinking of Chrome developer tools with its timelines for rendering, painting, JavaScript etc.

http://bloom-lang.net/ solves the nonorderly concurrency problems between machines with lattices.

I am unsatisfied how concurrency is represented in the mainstream languages. I find Rust async hard to understand and read at a high level from a schedule perspective. The function colouring issue is painful.

I want an interactive programming environment that lets me create a "tcp-connection" for instance and then register handlers for "on-ready-for-reading" and "on-ready-for-writing" which are system events that are triggered from IO threads that run epoll or liburing. You don't want to block the IO thread loop so you dispatch an event to another thread.

I've been designing a syntax for representing state machines and events that can be fired from different places, it looks like this:

   initialstate = state1 | state1a state1b state1c | state2a state2b state2d | state3
It waits for initialstate to be reached, then waits for state1 then it waits for state1a state1b state1c in any order and so on.

Ideally we want data flow to be a tree, sprinkled with synchronization points, which are barriers where independent tasks synchronize and where we exchange tasks and data. Shared memory synchronization is great for the amount of data that can be transferred in one go (you're not writing lots of data into a pipe as in multiprocessing or into a buffer in message passing; message passing can be O(1)) but I don't want to do it on the hot path: Amdahl's law.

Another program I've written with Java runtime executes the following program:

  thread(s) = state1(yes) | send(message) | receive(message2);
  thread(r) = state1(yes) | receive(message) | send(message2);
I've been trying to design a multithreaded server architecture here https://github.com/samsquire/three-tier-multithreaded-archit...

You could implement complicated interlocking workflows with this syntax, because you just need to wait for events as defined.

LMAX Disruptor gets some good requests per second


Can I recommend you take a look at effect-handler languages? They are not mainstream but may prove useful for your thinking here.

Effekt or OCaml 5.0's new effect handlers may give you ideas that help you achieve your goal here.


> If I do want to do concurrent work there is the usual async/await mechanisms, implemented in the Task module. Not as weird as in JS. They just return a Task reference which is a helpful abstraction on top of the Process ID or PID. Then you can await on it to get a result or await on multiple to get multiple results. There is also a wicked function called Task.async_stream which will take an enumerable (list, map, stream or similar) and run a Task for each entry as a lazy stream. By default it has a concurrency matching the number of available cores (just like the schedulers). This is essentially a fancy shortcut for using all your machine has to offer in the service of getting the work done for tasks that are embarrassingly concurrent. It is very fun to use when bodging together scripts that you want to go fast.

Hmm, am I reading about Elixir or C#... ;-)

Yes, there are other major differences but this entire paragraph can be taken verbatim for C#.
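For readers who have not seen the Elixir side of that comparison, the quoted paragraph corresponds to roughly the following. This is a sketch; `Work.slow_square/1` is a made-up function with a sleep standing in for real work:

```elixir
defmodule Work do
  # Hypothetical slow function standing in for real work.
  def slow_square(n) do
    Process.sleep(100)
    n * n
  end
end

# async/await: Task.async/1 returns a %Task{} wrapping the new process.
task = Task.async(fn -> Work.slow_square(3) end)
9 = Task.await(task)

# async_stream: lazily runs one task per element with bounded concurrency.
# Without max_concurrency it defaults to System.schedulers_online().
squares =
  1..8
  |> Task.async_stream(&Work.slow_square/1, max_concurrency: 4, timeout: 5_000)
  |> Enum.map(fn {:ok, result} -> result end)

IO.inspect(squares)
```

Results come back in input order by default, so the pipeline reads like the sequential `Enum.map/2` it replaces.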


One of those major differences would be that each ‘process’ is incredibly lightweight, with a memory model that allows you to spawn millions of them without having to think about memory allocation or CPU contention.
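A small sketch to get a feel for how cheap processes are. `ManyProcs` is a made-up module, and 100,000 is an arbitrary count chosen to stay well under the VM's default process limit:

```elixir
defmodule ManyProcs do
  # Spawns `count` processes that each send one message back,
  # then sums the values received.
  def run(count) do
    parent = self()

    for i <- 1..count do
      spawn(fn -> send(parent, {:done, i}) end)
    end

    Enum.reduce(1..count, 0, fn _, acc ->
      receive do
        {:done, i} -> acc + i
      end
    end)
  end
end

count = 100_000
{micros, total} = :timer.tc(fn -> ManyProcs.run(count) end)
IO.puts("#{count} processes, sum #{total}, #{div(micros, 1000)} ms")
```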


Yeah, the API is very similar.

I don't know the .Net runtime well enough to say anything about how it actually executes though.


The deal-breaker for me with Elixir has always been the lack of real Elixir vectors rather than the crappy Erlang array library. Yes, I know you can use a Map with numeric keys but that's not the same.
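For context, these are the options being compared, shown on a three-element collection. This is only a sketch of the existing APIs (zero-based indexing throughout), not a recommendation:

```elixir
# Erlang's :array module: functional array, index comes first in get/set.
arr = :array.from_list([10, 20, 30])
20 = :array.get(1, arr)
arr2 = :array.set(1, 99, arr)
[10, 99, 30] = :array.to_list(arr2)

# A Map keyed by index.
map_vec = [10, 20, 30] |> Enum.with_index() |> Map.new(fn {v, i} -> {i, v} end)
20 = Map.fetch!(map_vec, 1)
map_vec2 = Map.put(map_vec, 1, 99)
99 = map_vec2[1]

# A tuple: constant-time reads, but every update copies the whole tuple.
tuple = {10, 20, 30}
20 = elem(tuple, 1)
{10, 99, 30} = put_elem(tuple, 1, 99)

IO.puts("all three give indexed access; none is a contiguous mutable vector")
```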


One of the many reasons I left Erlang is the lack of user types. I understand how the BeamVM got there. You build into the system the idea that there may be multiple nodes that communicate over the network. Those nodes will routinely pass through states where they are on different versions of code. Those types may have to be upgraded at upgrade time. Having no user types in your type system, just a fixed set of dynamic types that are relatively simple, means that when it comes time to upgrade, you don't have to figure out how to load two method sets for the same module at the same time (and the corresponding multiplicity of states that can emerge beyond that); the new code can get the old value safely and easily since it has no methods on it, examine it, and upgrade it in a principled manner.

Nifty, powerful, and simple, like so much of Erlang.

But also like so much of Erlang, I think the modern approaches (several of them) that languages take to serialization are better. It's a good first pass cut at the problem, but I prefer all of the gRPC approach to the problem, the JSON approach to the problem, and honestly just letting the chips fall where they may with most modern serialization libraries. Treating the remote system as not entirely trusted and handling the messages with a bit of skepticism generally works out for quite a bit of scaling. And you get user types back, which means you are no longer stuck on the BeamVM's quite anemic data types.

If you look at the underlying implementations of an Erlang map, you'll see why you're not getting vectors anytime soon.

    1> dict:append(a, b, dict:new()).
    {dict,1,16,16,8,80,48,
          {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
          {{[],[[a,b]],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
It's a big pile of linked lists storing assoc lists held together by tuples, with all the component values being dynamically typed. It's nice they didn't cheat on that, but it is... not the most efficient approach to dictionaries, just the one enabled by their type system, such as it is.
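To make the "buckets of assoc lists held together by tuples" shape concrete, here is a minimal Python sketch. It only mirrors the layout visible in the shell output above; it is not Erlang's actual dict algorithm, and `new_dict`, `store`, and `fetch` are names made up for illustration.

```python
def new_dict(n_buckets=16):
    # A tuple of empty buckets; each bucket is a list of (key, value) pairs.
    return tuple([] for _ in range(n_buckets))

def store(d, key, value):
    # Return a new outer tuple; the old one is left untouched.
    slot = hash(key) % len(d)
    bucket = [(k, v) for (k, v) in d[slot] if k != key] + [(key, value)]
    return d[:slot] + (bucket,) + d[slot + 1:]

def fetch(d, key):
    # Linear scan of one bucket: cheap when buckets are short.
    for k, v in d[hash(key) % len(d)]:
        if k == key:
            return v
    raise KeyError(key)

empty = new_dict()
d1 = store(empty, "a", "b")
d2 = store(d1, "a", "c")
```

Every update rebuilds the outer tuple and one bucket, which is the cost of staying within immutable, dynamically typed building blocks.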


Note that Erlang has had maps since OTP 17, which I believe are implemented more efficiently than dicts.


Good point, though I think my point about user types still holds. It is very difficult to create any efficient user types of your own with the primitives Beam provides.


Elixir builds types out of maps (using the struct) and you can build multiple dispatch operations over types using Protocol. If you want to pretty print your user types, you can implement the Inspect protocol.


Yeah, the lack of user-defined types is definitely one of the biggest downsides of the BEAM.


For small fixed-length vectors, like 3D graphics, use tuples. They are contiguous in memory, fast to copy, and fixed O(1) time to access.



So what do you use real arrays for? For 80% of programming lists are just fine. Are you doing cpu-bound numerical calculations?


Interviewing, for when the interviewer is expecting an array (sarcasm intended).

All kidding aside, unless I'm interviewing at an Elixir shop, I've learned that Elixir is a little too weird for interviewers who don't know Elixir very well.


The only time I've hit this is when implementing algorithms that assume constant time access and modification (a lot of algorithms assume this, and can't be swapped for index based maps easily)

This occurred somewhat recently when implementing stable topological sort. I ended up swapping List out for a Vector (Aja is the library I used). I received a 100x performance improvement from it.

That's really the only time I've truly hit a List performance roadblock in 7+ years of Elixir.


Same things arrays are used for in all mainstream languages, i.e. index-based access.


You want tuples. They've got index-based access, and all your useful things in erlang:element/2, setelement/3, append_element/2, delete_element/2, insert_element/3.

With a big caveat that modifying tuples isn't great for performance unless the compiler or optimizer determines that mutating the tuple in place is acceptable rather than producing a modified copy.


I understood tuples are only optimised for a handful of elements.


This is correct. If you want a true array experience you have to use atomics or a nif.


That's not answering the question. I've written many thousands of lines of elixir and never used index based access.


Good for you but why do all mainstream languages, including Clojure, have them?


Because they're built on mutable datatypes. Elixir (and Erlang, and some flavors of lisp) are not. Instead of fudging it to make it syntactically "look like" an array access, elixir makes you flip through the list to understand the cost of what you're doing.

Even clojure's is a cheat.


Then maybe that's a shortcoming of absolute immutability.


What would be the difference between a "real vector" and a Map with numeric keys?


Maps are hashtables. Vectors are a contiguous area of memory where each element can be accessed by an index referring to a specific address in that memory.

Vectors are usually also homogeneous in the data they hold, because each element should occupy the same fixed amount of memory, so that i*item_size gives you the offset of element i in memory.

Anyway, in Elixir you can use Erlang's :array module.
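The offset arithmetic above can be sketched in Python with the standard `array` module. This is only an illustration of i*item_size addressing over a contiguous buffer; `element_at` is a made-up helper, not part of any library.

```python
from array import array

# A typed array stands in for a "real vector": every item has the same size.
vec = array("I", [10, 20, 30, 40, 50])
raw = vec.tobytes()  # copy of the contiguous backing bytes, for inspection

def element_at(raw_bytes, i, item_size, typecode):
    # Element i lives at byte offset i * item_size.
    offset = i * item_size
    chunk = raw_bytes[offset:offset + item_size]
    # Decode the chunk with the same typecode (native byte order).
    return array(typecode, chunk)[0]

third = element_at(raw, 2, vec.itemsize, vec.typecode)
```

No hashing and no tree walk: one multiplication gives the address, which is why the lookup is constant time.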


I'm pretty sure Erlang's :array is just a skin over tuples, so it's ~O(1) but not contiguous. It might be O(log n) for dynamic arrays. The only true array datatype in Erlang is :atomics.


Erlang tuples are contiguous in memory for primitive values https://blog.edfine.io/blog/2016/06/28/erlang-data-represent...


Array doesn't guarantee to be put into a single tuple. It might be nested.


> What would be the difference between a "real vector" and a Map with numeric keys?

Access in O(1) instead of O(log n).


Another thing with numeric maps is how insert would work. Swapping elements is trivial, but inserting a new one would require rewriting every subsequent entry.
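A minimal Python sketch of that cost, assuming a map with dense integer keys 0..n-1 and treating it as immutable (each operation returns a new dict). `swap` and `insert_at` are made-up names for illustration.

```python
def swap(index_map, i, j):
    # Only two keys change, no matter how large the map is.
    result = dict(index_map)
    result[i], result[j] = index_map[j], index_map[i]
    return result

def insert_at(index_map, pos, value):
    # Every entry at or after pos must be re-keyed to key + 1.
    n = len(index_map)
    result = {i: index_map[i] for i in range(pos)}  # untouched prefix
    result[pos] = value
    for i in range(pos, n):
        result[i + 1] = index_map[i]
    return result

m = {0: "a", 1: "b", 2: "c"}
swapped = swap(m, 0, 2)
inserted = insert_at(m, 1, "x")
```

In a persistent map that shares structure, the swap touches two paths while the insert touches one path per shifted key.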


Data is immutable in Erlang. How do you know the runtime does not optimize Maps with numeric keys into arrays?


Data is also immutable in Clojure but it still has vectors.


Yeah, but Clojure vectors are actually trees built to be persistent based on ideas from Bagwell's hash array mapped tries (what Clojure maps are), so they're not technically O(1).

Both maps and vectors in Clojure are trees, albeit very shallow trees (32-way branching). The difference lies in the interfaces and the lookup methods. (Maps hash keys and use bits to know which subtree to descend, while vectors use index bits.)
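A rough Python sketch of the index-bits lookup described above, assuming 5 bits per level (32-way branching). It covers lookup only, not persistent updates, and it is not Clojure's actual implementation; `build` and `lookup` are made-up helpers.

```python
BITS = 5            # 5 bits per level gives 32-way branching
WIDTH = 1 << BITS   # 32
MASK = WIDTH - 1    # 0b11111

def build(items):
    # Pack items into leaves of up to 32, then group nodes upward
    # until a single root remains. Returns (root, shift).
    items = list(items)
    level = [items[i:i + WIDTH] for i in range(0, len(items), WIDTH)] or [[]]
    shift = 0
    while len(level) > 1:
        level = [level[i:i + WIDTH] for i in range(0, len(level), WIDTH)]
        shift += BITS
    return level[0], shift

def lookup(root, shift, index):
    # Consume the index 5 bits at a time, most significant chunk first.
    node = root
    while shift > 0:
        node = node[(index >> shift) & MASK]
        shift -= BITS
    return node[index & MASK]

root, shift = build(range(100_000))
```

With 100,000 elements the walk is four hops, which is why these vectors are usually described as "effectively" constant time rather than strictly O(1).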



There is an array library which creates a MapArray type but with list semantics, and which conforms to the Enum and Collectable protocols: https://hexdocs.pm/arrays/Arrays.html. I've found it useful for when I need an array. Then there's Nx for vectors, which needs extra deps, granted.


Vs which other languages - do you have examples of better patterns?


all the C family

all the JVM languages

all the .Net languages

Rust

Ruby

Python

etc.


Python lists are generally not contiguously allocated memory (I think it might be if you are storing integers less than 255), don't be fooled. That's why you need numpy.


I'm talking about Python arrays

https://docs.python.org/3/library/array.html

   >>> from array import array
   >>> a = array('B', [1, 2, 3, 4, 5])
   >>> str(a.buffer_info()[1] * a.itemsize) + " bytes at address #" + str(a.buffer_info()[0])
   '5 bytes at address #4379640336'


   >>> b = array('l', [1, 2, 3, 4, 5])
   >>> str(b.buffer_info()[1] * b.itemsize) + " bytes at address #" + str(b.buffer_info()[0])
   '40 bytes at address #4380812752'


You can achieve a similar goal with the Akka framework on the JVM (functionally native with Scala). Uses the actor model and message passing for every concurrency goal achieved by Elixir. AND you can bundle in all your existing jars if the Java world matters to you.


Thanks. What's the Ruby example? It's the one I'm most familiar with.


   irb(main):002:0> ["GFG", "GFG", "GFG", "GFG"].class
   => Array
   irb(main):003:0> {1 => "CFG", 2 => "CFG"}.class
   => Hash
   irb(main):006:0> {1 => "CFG", 2 => "CFG"}.keys.class
   => Array
   irb(main):007:0> {1 => "CFG", 2 => "CFG"}.values
   => ["CFG", "CFG"]
   irb(main):008:0> {1 => "CFG", 2 => "CFG"}.values.class
   => Array


Totally agree with conradfr



