Hi guys: I notice that my Clojure code rapidly descends into "map hell" - where I have lots of maps with all sorts of hidden data inside of them.
How can I implement a robust application + data model in Clojure? For example, an application which models a semantically rich domain which would typically contain 5 to 6 relational tables.
Of course, I'm sure that you can use Java beans etc. through Clojure's Java interop, but I want to do things in an idiomatic way, and I want my data model to be implemented in as "functional" a way as possible, to keep in sync with the intrinsic power of the language.
Without seeing your code / model, it's hard to comment - could you elaborate on what sort of single domain entity requires half a dozen relational tables?
If you want Java-style "beans" in Clojure, you might look at defrecord but that really only gives you a map with a base set of named fields so I'm not sure it'll be "better" (and most Clojurians suggest working with regular maps where possible).
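To make the "map with a base set of named fields" point concrete, here is a minimal sketch. The record name AtomSite and its fields are made up for illustration and are not from the original poster's model.

```clojure
;; Hypothetical record; the name and fields are assumptions for illustration.
(defrecord AtomSite [element x y z])

(def carbon (->AtomSite "C" 1.0 2.0 3.0))

;; A record answers keyword lookups just like a plain map...
(def carbon-element (:element carbon))

;; ...and still accepts keys beyond its declared fields.
(def charged (assoc carbon :charge 0))
```

So a record documents the expected fields, but it does not stop anyone from adding or omitting other keys.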
I'm wondering if the problem might be more related to the way you're structuring the code that manipulates your maps and so you're not seeing the benefits of decomposition and generic operations?
Joined: Aug 30, 2005
Yes, I'm having trouble thinking in maps and lists. I need to see how a semantically deep application can be built in Clojure, and how map/list approaches compare to bean/data-type oriented approaches.
I don't see much difference between a Java bean with getFoo() / getBar() and a map with :foo and :bar attribute access so I guess I'm not sure where you're struggling - which is why I asked for more elaboration on the domain entity from half a dozen tables... If you can provide some more concrete examples, maybe I can point you in the right direction?
Well Sean, I guess what I wanted to do was as follows:
I have an app with visualization, algorithms, database I/O, and data federation via REST. I wanted to "split" the modules up into different submodules that different people can work on.
Normally, in Java I might do this by writing some basic boilerplate Java beans, and then writing interfaces for each module. Like for example.
Then, now that I know all my "Producers" can hook into my "Visualizers", I can start specifying implementation details, writing unit tests, and making nice object APIs that make it easy to write code on the fly in an IDE.
But I'm confused with the LISPy approach. I write something, like a map, and then I forget what was in the map... Then I send that map to another method, which takes a list, and that method crashes..... etc..... So I want to impose structure on my engineering process so that my code is self documenting and never gets out of control.
I'm not sure how to go about top-down design in the Clojure world ---- am I just thinking of things wrong ? Should I try to engineer my app from the bottom up ?
Every time I have a method with a multidimensional map of lists I cringe at the thought of "what if there is missing data in this map? what if there is a typo in the keys ? how will I know?"
The problem is that I'm a bioinformatician - I have data structures which are complex semantically. For example, I want to suck a "protein structure" into Clojure - which is composed of a "Structure" that has many "models", each of which has 100s of "amino acids", which have "atoms". Converting a data structure like this into a map would be scary - I might forget about subtleties in my data model when inserting elements into the map, due to the fact that maps have no restrictions.
I firmly believe in maps and lists --- but how can I impose a "superstructure" on the methods in my clojure scripts so that they don't logically begin to deviate from one another as the source code base becomes larger ?
Splitting stuff up into sub-modules would correspond to functionality being divided into namespaces.
If you really know exactly what your APIs should be in advance, take a look at protocols. Otherwise evolve the API as a set of plain functions, at least to start with.
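A rough sketch of the two options, borrowing the "Visualizer" idea from the question. The protocol name, its single method, and the TextVisualizer record are all assumptions for illustration, not an API from the original post.

```clojure
;; Option 1: a protocol, for when the API is known up front.
;; Visualizer and render are hypothetical names.
(defprotocol Visualizer
  (render [this data] "Return a printable representation of data."))

(defrecord TextVisualizer [prefix]
  Visualizer
  (render [_ data] (str prefix (pr-str data))))

;; Option 2: a plain function over data, easy to change as the API evolves.
(defn render-text [prefix data]
  (str prefix (pr-str data)))

(def via-protocol (render (->TextVisualizer "> ") {:atoms 3}))
(def via-function (render-text "> " {:atoms 3}))
```

Both produce the same string here; the protocol version just pins down a contract that several implementations can satisfy.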
I don't think either of those have anything to do with "Java beans".
So let's look at the third piece of the puzzle: your complex nested data structures.
If your data structures are regular - Structure has-many Model has-many Amino-Acid has-many Atom - then you don't have a problem: just choose the right basic data structure at each level (map, vector, set, list) and you're good to go. You can map a Model-Function across a Structure, an Amino-Acid-Function across a Model, an Atom-Function across an Amino-Acid. That partitioning means you don't need to worry much about "types" - you have a natural cascade and any deviation is a logic error in your code (unit tests will help you here).
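As a minimal sketch of that cascade, assuming a toy structure with made-up keys (:models, :amino-acids, :atoms) and made-up function names:

```clojure
;; Toy data; the keys and values are assumptions for illustration.
(def structure
  {:id "demo"
   :models [{:amino-acids [{:name "GLY"
                            :atoms [{:element "N"} {:element "C"}]}
                           {:name "ALA"
                            :atoms [{:element "N"} {:element "C"} {:element "O"}]}]}]})

;; One small function per level of the hierarchy.
(defn atom-count [amino-acid]
  (count (:atoms amino-acid)))

(defn model-atom-count [model]
  (reduce + (map atom-count (:amino-acids model))))

(defn structure-atom-counts [structure]
  (mapv model-atom-count (:models structure)))

(def counts (structure-atom-counts structure))
```

Each function only knows about its own level, so passing a Model where an Amino-Acid is expected shows up quickly in a unit test.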
If the model isn't that regular, tag elements by using an enclosing map, either with a key matching the "type" or with a :data key and a :type key. You can either do a conditional off the type key (which would be nil if not the matching type) or you can dispatch off :type selection (using cond or multimethods).
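Here is a small sketch of the :type / :data tagging with a multimethod. The describe function and the tag keywords are invented for the example.

```clojure
;; Dispatch on the :type key of the enclosing map.
(defmulti describe :type)

(defmethod describe :atom [{:keys [data]}]
  (str "atom " (:element data)))

(defmethod describe :amino-acid [{:keys [data]}]
  (str "residue " (:name data)))

;; Anything untagged or unrecognized lands here instead of throwing.
(defmethod describe :default [m]
  (str "unknown element of type " (pr-str (:type m))))

(def descriptions
  (mapv describe [{:type :atom :data {:element "C"}}
                  {:type :amino-acid :data {:name "GLY"}}
                  {:data {:name "???"}}]))
```

The :default method is optional, but it gives one obvious place to handle a missing or misspelled tag.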
Does that help?
Yes it helps... But I'm starting to wonder... Will I miss being able to hit "." in my IDE and SEE all of my options?
Every time I change my code, will I have to go RTFM about which fields are in which map, etc.? How do you maintain all of these nested models in your mind without an IDE that code completes your data structures for you?
Ah, I've worked with dynamic languages so long - and unsophisticated IDEs back when I learned Java originally - that it had never occurred to me that would be a problem...
I can think of two things that may help you:
First, you'll have a REPL open at all times (once you get into the "Lisp way" of developing) so it's easy to run (doc my-func) to see the docstring / arglists as well as being able to pull up real data and do (keys my-data) to refresh your memory about what's in a data structure. Also, as you build up that namespace, you'll encapsulate the structure of the data in the functions you create - only when editing those functions would you need to double-check the data's structure.
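Roughly what that looks like in a REPL session. The function and data here are placeholders; note that doc lives in clojure.repl, which most REPLs refer into the user namespace for you, so the require is only needed elsewhere.

```clojure
(require '[clojure.repl :refer [doc]])

;; Placeholder function with a docstring.
(defn residue-names
  "Return the names of all amino acids in a model map."
  [model]
  (map :name (:amino-acids model)))

;; Placeholder data standing in for something pulled up at the REPL.
(def my-data {:amino-acids [{:name "GLY"} {:name "ALA"}]
              :chain "A"})

(doc residue-names)          ; prints the arglists and docstring
(def top-level (keys my-data)) ; the keys at the top of the structure
```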
Second, you can write "getters" in the namespace related to that aspect of the data so that you can easily browse what public functions are available to operate on a given piece of data. I personally don't use such getters but you might find them helpful as you get started since it will feel more Java-ish.
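If you do want getters, a sketch might look like this. The names and keys are assumptions; in a real project these would sit together in the namespace for that part of the model.

```clojure
;; Thin accessor functions over plain maps.
(defn model-amino-acids [model] (:amino-acids model))
(defn amino-acid-name   [amino-acid] (:name amino-acid))
(defn amino-acid-atoms  [amino-acid] (:atoms amino-acid))

(def model {:amino-acids [{:name "GLY" :atoms [{:element "N"}]}
                          {:name "ALA" :atoms [{:element "C"}]}]})

;; Getters compose like any other function.
(def names (mapv amino-acid-name (model-amino-acids model)))
```

A typo in a getter name is a compile-time error, whereas a typo in a keyword just returns nil, which is part of the appeal when starting out.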
By structuring your code appropriately, it should reflect the semantics of your data and you should find that you're not struggling to remember the data formats.
As an example at World Singles, we have a search-query map that contains :criteria which is a vector of individual criterion maps. Careful naming - of functions and arguments - allows us to see at a glance what we're working on and be sure of which functions are appropriate.
Part of Clojure's power - and the ability to create truly reusable code - comes from being able to treat data generically. The downside is a little bit more care is needed with naming and organizing your code to provide the clues that a strong type system otherwise offers. If you have a map, rather than a typed bean, you can apply any generic function that accepts a map - you no longer need to write a specialized version of iteration for each "type" of map (corresponding to typed beans).
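A tiny illustration of the generic-operations point. The two maps and the missing-keys helper are made up; the point is only that one function serves both "types".

```clojure
;; Two unrelated "types" of map; contents are assumptions for illustration.
(def atom-site {:element "C" :x 1.0 :y 2.0 :z 3.0})
(def criterion {:field :age :op :>= :value 21})

;; One generic helper works on any map, no per-type version needed.
(defn missing-keys [m required]
  (vec (remove #(contains? m %) required)))

(def atom-missing      (missing-keys atom-site [:element :x :charge]))
(def criterion-missing (missing-keys criterion [:field :op :value]))

;; Core functions such as select-keys and update are generic in the same way.
(def position (select-keys atom-site [:x :y :z]))
(def shifted  (update atom-site :x inc))
```

With typed beans you would typically write or generate this kind of check once per class.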
What about a DSL? I'm surprised you haven't mentioned that as a solution.
Wouldn't the careful use of protocols and extension allow me to create a terse system of macros which was robust enough to model the domain in an abstract manner, so as to cover up the rickety underlying maps and lists?
Yes, given your domain model has a specific, well-defined vocabulary, I would actually expect your solution to evolve into a DSL over time - even if you didn't start out that way.
Protocols and type extension may help. Later on, macros might help add syntactic sugar to make the code cleaner. You might also look at multimethods since those can dispatch off things like key/value data from maps (so you can dispatch based on "type" tags, for example).
It is generally considered better practice to provide functions first, then add macros only where they improve code clarity (macros are not as composable as functions). It's also usually recommended to start with the generic data structures and evolve your model over time, adding abstractions where they make sense - rather than trying to go all out and create a full, typed system up front.
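A sketch of "functions first, macros for sugar". The tag function and deftag macro are hypothetical names invented for this example.

```clojure
;; Step 1: a plain function does the real work and composes normally.
(defn tag [type data]
  {:type type :data data})

;; Step 2: a macro adds sugar only, by defining a named constructor
;; that delegates to the function above.
(defmacro deftag
  "Define a constructor named sym that tags its argument with :sym."
  [sym]
  `(defn ~sym [data#]
     (tag ~(keyword (name sym)) data#)))

(deftag amino-acid)

(def via-function (tag :amino-acid {:name "GLY"}))
(def via-macro    (amino-acid {:name "GLY"}))
```

Because the macro only generates a call to tag, everything stays usable with map, partial, and friends through the function.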
So, start generically, evolve a DSL, refine the API with protocols and/or multimethods, add syntactic sugar with macros where it makes the code easier to read.
The interactive nature of Clojure/Lisp development based on the REPL means you get a chance to try out several approaches really easily and gradually settle on what works best without going too far down any particular rabbit hole. For example, with one section of my application, I initially thought I needed mutable data and protocols but as I worked in the REPL exploring various possibilities, I was able to create an elegant solution that was entirely based on immutable data and used one multimethod function (with five dispatch points). It was a lot cleaner and more idiomatic than the initial direction I expected to take, but if I'd gone ahead and planned and coded up all of that initial direction, I would have had a lot of rework to do later on - or would have had to live with a more brittle, more complex solution.