
ML and Java

 
Greenhorn
Posts: 2
Hello everyone
I've been playing around with AI and ML algorithms, and most of the tools and frameworks are built in Python! Is there any Java library to load data and train a model in Java?
If so, please tell me about it.
 
Marshal
Posts: 4518
572
VSCode Eclipse IDE TypeScript Redhat MicroProfile Quarkus Java Linux
I'm not into machine learning, but I listened to Adam Bien's podcast episode #169 where he talked with Zoran Sevarac about an interesting project named Deep Netts (link to MP3 of podcast).

There is an article on foojay here: Getting Started with Deep Learning in Java Using Deep Netts, and a 4 minute introduction video on YouTube here: Deep Netts Pitch.
 
Saloon Keeper
Posts: 27819
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
Welcome to the Ranch, Ammar!

It's true. The favored platform for ML is Python, but there are Java options:

https://onix-systems.com/blog/top-10-java-machine-learning-tools-and-libraries
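
To make the original question concrete, here is a minimal, library-free sketch of "load data and train a model" in plain Java. It is not the API of Deep Netts or of any library from the list above; the class name TinyLinearRegression, the learning rate, and the epoch count are all assumptions chosen for illustration. It fits y = w*x + b by batch gradient descent on mean squared error.

```java
// Library-free sketch: fit y = w*x + b with batch gradient descent.
// Class and method names are illustrative, not taken from any ML library.
final class TinyLinearRegression {
    double w = 0.0;
    double b = 0.0;

    // Prediction of the current model for one input value.
    double predict(double x) {
        return w * x + b;
    }

    // Runs a fixed number of full-batch gradient steps on mean squared error.
    void fit(double[] xs, double[] ys, double learningRate, int epochs) {
        int n = xs.length;
        for (int epoch = 0; epoch < epochs; epoch++) {
            double gradW = 0.0;
            double gradB = 0.0;
            for (int i = 0; i < n; i++) {
                double error = predict(xs[i]) - ys[i];
                gradW += 2.0 * error * xs[i] / n; // d(MSE)/dw
                gradB += 2.0 * error / n;         // d(MSE)/db
            }
            w -= learningRate * gradW;
            b -= learningRate * gradB;
        }
    }

    public static void main(String[] args) {
        // Toy data generated from y = 2x + 1 (an assumption for the demo).
        double[] xs = {0, 1, 2, 3, 4};
        double[] ys = {1, 3, 5, 7, 9};
        TinyLinearRegression model = new TinyLinearRegression();
        model.fit(xs, ys, 0.05, 5000);
        System.out.printf("w=%.3f b=%.3f%n", model.w, model.b);
    }
}
```

A real project would more likely lean on one of the libraries mentioned in this thread; the point here is only that nothing in the language itself stands in the way.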
 
Bartender
Posts: 1359
39
IBM DB2 Netbeans IDE Spring Java
Anyway,  for ML and AI I'd embrace Python without thinking twice.
Don't misunderstand me: I'm really a Java aficionado, but I think that it's better to use the right tool, and that for ML, the right tool is Python.
 
Bartender
Posts: 268
12
IntelliJ IDE Linux

Claude Moore wrote:Anyway,  for ML and AI I'd embrace Python without thinking twice.
Don't misunderstand me: I'm really a Java aficionado, but I think that it's better to use the right tool, and that for ML, the right tool is Python.



Digging around old threads a little... you seem to be well informed on this stuff--here's a question.

I know Python is the clear #1 for ML, but... why? Is there a GOOD reason for this?

Is it just because it was the popular language used by researchers, and it kind of snowballed since "all the other ML stuff is in Python"? Or is there some other reason? I also wonder why ML went this way, but crypto (another similar bleeding edge field) went mostly towards a high performance language.
 
Saloon Keeper
Posts: 15555
364

Lou Hamers wrote:Is it just because it was the popular language used by researchers, and it kind of snowballed since "all the other ML stuff is in Python"?


Pretty much. Nobody is going to reinvent ML algorithms.

I also wonder why ML went this way, but crypto (another similar bleeding edge field) went mostly towards a high performance language.


Most languages have pretty decent standard libraries for cryptography. That means people can use whatever language they want to build their newest scam coin implementation on top of. On top of that, cryptography appears deceptively simple to beginners in the field, so it's more tempting to reinvent stuff in your language of choice.

There's also a bit of culture involved. Not many people immediately think of performance when they think about ML. For crypto it's different. When talking about crypto algorithms, one of the first things you'll hear people mention is speed. If your implementation is not the fastest one available, people just don't want to touch it.
 
Lou Hamers
Bartender
Posts: 268
12
IntelliJ IDE Linux
Makes sense, but I hope whatever's written in Python gets implemented in other languages eventually.

If that's how it works, we'd have tons of code written in horrible languages like JavaScri... ah, crap...
 
Claude Moore
Bartender
Posts: 1359
39
IBM DB2 Netbeans IDE Spring Java

Lou Hamers wrote:Makes sense, but I hope whatever's written in Python gets implemented in other languages eventually.



Python quickly became the de facto language for ML for two main reasons. First, it's a simple yet effective language, and it doesn't take long to learn as much of it as you need to approach ML, which is essentially math.
Second, scikit-learn is an excellent library for scientific calculations, and it's also effective in contexts rather different from ML: it has gained widespread adoption in many research fields. Moreover, with tools like Jupyter Notebook you can quickly experiment and prototype a working solution, which you may then want to run on Google Colab, for example.
Please note that I'm talking about DOING ML, not about USING ML in some project: using the powerful AI/ML models we all know, like OpenAI's or Gemini, is a task you can accomplish with whatever language you want, as long as your preferred language can make REST API calls.
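
As a rough illustration of the "USING ML" case from Java, the sketch below builds a JSON POST request with the JDK's built-in java.net.http client (Java 11+). The endpoint URL, the JSON field name "prompt", and the MODEL_API_KEY environment variable are hypothetical placeholders, not any provider's real API; check your provider's documentation for the actual URL, headers, and body shape.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class ModelApiClient {
    // Hypothetical endpoint; replace with your provider's documented URL.
    static final String ENDPOINT = "https://api.example.com/v1/generate";

    // Builds (but does not send) a JSON POST request carrying the prompt.
    static HttpRequest buildRequest(String apiKey, String prompt) {
        // Hypothetical body shape: a single "prompt" field.
        String json = "{\"prompt\": \"" + escape(prompt) + "\"}";
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(30))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // Minimal escaping for backslashes and double quotes only;
    // a real client should use a JSON library instead.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildRequest(System.getenv("MODEL_API_KEY"), "Hello");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Building the request is kept separate from sending it so the request can be inspected or tested without network access.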
 
Tim Holloway
Saloon Keeper
Posts: 27819
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
The thing about engines like TensorFlow is that they are essentially "black boxes" that can be implemented on a wide variety of platforms, including graphics cards and custom AI logic processors, but the basic control is high-level and can easily be done via a scripting language.

Command-shell scripting is non-portable (you can't run a PowerShell script under ksh!) and often cryptic. JavaScript is ugly and requires a complex environment to run. Visual Basic is not only vendor-specific, but practically dead now. Perl is "write-only" and, in my experience, prone to library breakage. So in the non-compiled world, Python is left as one of the few popular alternatives. It doesn't hurt that Python support is virtually guaranteed on Linux - Red Hat distros would have a hard time doing boot-time hardware configuration without Anaconda.

Java has many virtues, but it's a fair amount of work to set up a Java project. And the extra overhead to edit, compile, debug, repeat just to run simple engine directives isn't worth the effort. Likewise for C/C++, Rust, and so forth. Stuff like Ruby offers no real advantage either, and it's not as popular as it used to be. Almost nobody runs Rexx since the Amiga died, either, as another example of a platform with potential that missed the bus.

Python has its problems, but they mostly don't come into play when doing ML/AI. So that's where the most work has been done.
 
Lou Hamers
Bartender
Posts: 268
12
IntelliJ IDE Linux
This is probably my own familiarity speaking, but I do think it's worth the small overhead Java brings (eventually). Compiling isn't so painful, and compared to an interpreted language I would think in the long run it wins out as things grow in complexity.

I remember reading somewhere that often the "research/science people" will do an initial implementation of a thing in Python, then find "IT'S TOO SLOW!", and bring in the (non-research/science) Java people to build a faster or more efficient version of the thing. (Presumably after most of the major experimentation is over with.) I don't know if it'd be so much slower to do the initial work in Java in the first place, but that process seems a little wasteful.

Tim Holloway wrote:...prone to library breakage. So in the non-compiled world, Python is left as one of the few popular alternatives. It doesn't hurt that Python support is virtually guaranteed on Linux...



This stood out to me. My experience using Python sounds a lot like that Perl issue. The dev environment and dependency stuff just comes across as amateur hour compared to Java's much more mature system. If it's done correctly, a cloned Gradle or Maven project will pretty much always "just work" as long as your JRE version is compatible (trivial to resolve that with SDKMAN if needed).

So in Python they've built stuff like "anaconda" (a multi-GB beast to install!!! and they call Java bloated?) and I know there's some other "virtual environment" stuff to deal with dependency conflicts and so on, but that comes across to me as overkill and a pain (maybe it had to be because they couldn't do it a better way).

In my experience Linux makes it even worse if you do everything at the system level, because many Linux distros have Python installed for a reason--the distro is using it! If you start messing with it at the system level to get your Python dev stuff working, you can mess up your OS.
 
Tim Holloway
Saloon Keeper
Posts: 27819
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
I'll address this in reverse.

First, you can have multiple versions of Python installed in Linux, just as with Java. In fact, for a while, until Python 2 was killed off, having both Python 2 and Python 3 was a regular thing. It helped with the migration process by not dumping you from one to the other à la Visual Basic.

Secondly, Anaconda goes way back on Red Hat and Linux has a history of booting faster than Windows. Anaconda may actually be less bloated these days, since you no longer have to detect and configure as many different network interface, video, and disk controller types in this more standardized age. I'm not an anaconda expert, but I'm sure there are compelling reasons why such a critical process wasn't done in a compiled language. Red Hat has - at least before IBM ate them - never been part of the "Just Git 'er Dun!" software school and they've had plenty of time to convert had it been considered necessary. Of course, rebooting Linux is a lot less frequent than rebooting Windows…

The libraries I can pull in via pip have rarely given me problems, unlike CPAN. The pip utility is sort of like Maven that way.

And finally, as I said, you don't really "program" ML systems, you train them. The extra time and effort to edit/compile/debug versus simply edit-and-go is an unnecessary annoyance. The high-processor-usage is done in the "black box" and I neither know nor care whether that box is Python code, C code, or an FPGA co-processor.

I am a rabid fan of Java, or I wouldn't be such a perpetual nuisance here on the Ranch. But some things just aren't that great a fit for Java. I didn't do my recipe webapp in Python, but the IoT board I previously mentioned isn't powerful enough to run Java. It can, however, run CircuitPython and TensorFlow Lite.

Another hot platform for ML is the Raspberry Pi. I can attest that the Pi has been able to run Java historically, but the unit I had running my CNC machine had horrible performance using a Java GRBL interpreter and I was forced to replace it with a native-code app. JVMs are so hungry that the Amdahl/V6 mainframe I used to jockey back in the mid-1980s couldn't even begin to support a single JVM. So ML apps on the Pi are typically done in Python, because, again, no matter what the AI box is written in, the end users don't want to muck around with low-level code when they can do some simple Python instead. Since ML is less about programming than it is about training, a lot of people with no programming aptitude to speak of can get involved with it.




 
Lou Hamers
Bartender
Posts: 268
12
IntelliJ IDE Linux
You can only have one 'python' at the system CLI level, but yes I don't think that's a big issue really, Java has the same issue. The nuisance I'm thinking of is all the 'pip install etc' commands that a lot of open source stuff tells you to run to get their thing working. I've had situations where I had a version of something installed that was needed for one thing, and needed another version of the same package for some other project.

The only way (that I know of.. I'm no Python expert) to resolve that without breaking your previous setup is to use some kind of virtual environment... assuming I'm not dead wrong about that, that's a horrible experience. In Java we don't "install" our dependencies, so it's just a better system. I don't know why Python devs (apparently) feel this method of dependency management is acceptable at all.

Tim Holloway wrote:
And finally, as I said, you don't really "program" ML systems, you train them. The extra time and effort to edit/compile/debug versus simply edit-and-go is an unnecessary annoyance. The high-processor-usage is done in the "black box" and I neither know nor care whether that box is Python code, C code, or an FPGA co-processor.



I've done some ML training experiments before, so I get that, but apparently I'm just not grasping something about the edit/compile/debug "effort". Or maybe we're talking about different things. If we're just training or using a model, I'd agree the language it's built in isn't important (as long as it's not a nightmare to get it running, which it can be with Python in my experience!).

What I'm thinking of (something I haven't done at all yet) is actually coding/scripting the "pipeline" and shipping the data around from stage to stage. Other than the fact that Python has more libraries to call on, I'm not seeing how Java has a major disadvantage there (admittedly the former lack of native libraries disadvantage is a significant one).
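
A minimal sketch of what such a pipeline could look like in plain Java, with each stage as a function and the stages chained with Function.andThen. The stage names (parse, scale, score) and the stand-in "model" (a simple mean) are assumptions made up for illustration; a real pipeline would call an actual model or service in the last stage.

```java
import java.util.Arrays;
import java.util.function.Function;

final class PipelineSketch {
    // Stage 1: parse one comma-separated line of numbers into a feature vector.
    static double[] parse(String line) {
        return Arrays.stream(line.split(","))
                .map(String::trim)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }

    // Stage 2: min-max scale the vector into [0, 1]; a constant vector maps to zeros.
    static double[] scale(double[] v) {
        double min = Arrays.stream(v).min().orElse(0.0);
        double max = Arrays.stream(v).max().orElse(0.0);
        double range = max - min;
        return Arrays.stream(v)
                .map(x -> range == 0.0 ? 0.0 : (x - min) / range)
                .toArray();
    }

    // Stage 3: stand-in for a model call; here it just averages the features.
    static double score(double[] v) {
        return Arrays.stream(v).average().orElse(0.0);
    }

    // Chains the three stages into a single String -> Double function.
    static Function<String, Double> build() {
        Function<String, double[]> parseStage = PipelineSketch::parse;
        Function<double[], double[]> scaleStage = PipelineSketch::scale;
        Function<double[], Double> scoreStage = PipelineSketch::score;
        return parseStage.andThen(scaleStage).andThen(scoreStage);
    }

    public static void main(String[] args) {
        Function<String, Double> pipeline = build();
        System.out.println(pipeline.apply("1, 2, 3"));
    }
}
```

Because each stage is an ordinary function, any one of them could later be swapped for a call to a remote service without touching the others.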
 
Lou Hamers
Bartender
Posts: 268
12
IntelliJ IDE Linux

Tim Holloway wrote:
Another hot platform for ML is the Raspberry Pi.



I wonder how well a minimized/custom JRE (Java 9+) would do here. Given that they went to all that effort causing everyone big headaches during the process of creating JPMS, I would hope it makes Java competitive running in a low resource environment. At least I thought that was a main goal there!
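
For anyone who wants to try this, a custom runtime is built with the JDK's jlink tool (Java 9+). The sketch below invokes it from Java through java.util.spi.ToolProvider. The module list (java.base only) and the output directory build/minimal-jre are assumptions; a real application would list the modules it actually needs. It shows only how such a runtime is produced and says nothing about how it would perform on a Pi.

```java
import java.util.spi.ToolProvider;

final class MinimalRuntimeSketch {
    // Assembles jlink arguments for a stripped-down runtime image.
    static String[] jlinkArgs(String modules, String outputDir) {
        return new String[] {
            "--add-modules", modules,   // comma-separated root modules to include
            "--strip-debug",            // drop debug information
            "--no-header-files",        // omit C header files
            "--no-man-pages",           // omit man pages
            "--output", outputDir       // target directory; must not already exist
        };
    }

    public static void main(String[] args) {
        // jlink is only available when running on a full JDK.
        ToolProvider jlink = ToolProvider.findFirst("jlink")
                .orElseThrow(() -> new IllegalStateException("jlink not found; run on a full JDK"));
        // Hypothetical choices: java.base only, written to build/minimal-jre.
        int exitCode = jlink.run(System.out, System.err,
                jlinkArgs("java.base", "build/minimal-jre"));
        System.exit(exitCode);
    }
}
```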
 
Tim Holloway
Saloon Keeper
Posts: 27819
196
Android Eclipse IDE Tomcat Server Redhat Java Linux

Lou Hamers wrote:You can only have one 'python' at the system CLI level, but yes I don't think that's a big issue really, Java has the same issue. The nuisance I'm thinking of is all the 'pip install etc' commands that a lot of open source stuff tells you to run to get their thing working. I've had situations where I had a version of something installed that was needed for one thing, and needed another version of the same package for some other project.


That's not the case for either language. The easy way during Python2/Python3 days would be to invoke by name "python3". But the alternatives system, loathe it though I do, allows fine-tuning of that behavior. In both cases, a PATH override can select a particular version on a per-shell basis.

For libraries, it's a bit different. Unlike a Java Maven repo, installed Python libraries aren't versioned (which is curious, since Linux and Unix do support versioned OS libraries!) But there are mechanisms.

First and foremost, depending on how you pip-install, your selected library can go into the common system library, your own local account library or a local project. Most serious Python developers do, in fact, recommend setting up a Python virtualenv, and it shows how Python is a secondary language to me that I don't think I've ever actually done that. I don't have enough Python projects around to have had version conflicts and for that matter, have had relatively little breakage putting projects developed on the latest Fedora into CentOS 7.

Incidentally, when you put a wheel together to make an app pip-installable, it can define dependencies with versions. So there's that.

Bottom line, though, is that even though I'm no Python wizard, I really haven't had that much trouble with that sort of stuff. Which is good. Like Alan Kay said: "Simple things should be simple and complex things should be possible".

Lou Hamers wrote:
What I'm thinking of (something I haven't done at all yet) is actually coding/scripting the "pipeline" and shipping the data around from stage to stage. Other than the fact that Python has more libraries to call on, I'm not seeing how Java has a major disadvantage there (admittedly the former lack of native libraries disadvantage is a significant one).



My point of view is that I'd be more likely to just glue together microservices, in which case, what language or even what machine a given pipeline stage is in isn't really that important.

One thing for good or ill (and I've seen both), however, is that a scripting system can be edited in-place, whereas a compiled production system often entails more red tape and, of course, building a deployable module. Since ML is more of a seat-of-the-pants sort of deal, and as mentioned, not everyone in it is going to be competent in IDEs, build tools and the like, it's hardly surprising that an interpreted language is preferable. While I sneer at the idea that programmers are going to be obsolete soon (I've been hearing that since the 1970s), there are some things that simply don't need the extensive software development expertise that something like an Amazon ordering system does.

Lou Hamers wrote:
I wonder how well a minimized/custom JRE (Java 9+) would do here. Given that they went to all that effort causing everyone big headaches during the process of creating JPMS, I would hope it makes Java competitive running in a low resource environment. At least I thought that was a main goal there!


We tried that with JavaME. It didn't go well. I know. I used to moderate the JavaME forum. Granted, Android's Dalvik is also stripped down, but my suspicion was that the biggest performance liability on the Pi was that its JVM didn't JIT as aggressively as the mainstream ports. As far as stripping it down, recall that the Pi was originally pitched as a "real" computer for the impoverished masses (been there/did that and I was pretty broke when I bought my first Pi, actually). A real computer deserves real software, and there are actually few apps in the Debian family that don't have a Pi package. And that's often as much due to failure to implement as it is lack of capability. It's ironic that I was able to emulate an IBM System/370 (probably more powerful than the original hardware), but there wasn't a 3270 terminal emulator app for the Pi. Not because it's complicated. Just because no one had bothered to port it.
 
Lou Hamers
Bartender
Posts: 268
12
IntelliJ IDE Linux

Tim Holloway wrote:
That's not the case for either language. The easy way during Python2/Python3 days would be to invoke by name "python3". But the alternatives system, loathe it though I do, allows fine-tuning of that behavior. In both cases, a PATH override can select a particular version on a per-shell basis.


Yep, we're crossing wires a little - I agree. I meant we can only have 'python' (or 'java') soft link to a single version. The 'python3' cheat is fine enough (however now we have a bit of a mess since 2 was finally EOL'd and I presume we'll eventually want to drop the 3 in the command at some point). But these are pretty minor problems and I manage it for Java easily by using SDKMAN on Linux.

I'm totally good with Python dependencies not being managed natively by Linux too. At least on Debian that's not always happy-fun-time ("packages held back" - ugh, annoying). Not that Python's tools do much better of a job, but at least we have the virtual environment option.

From now on if I need to do anything at all with Python, even hello world, I'm setting up some type of virtual environment. lol
Burned once is enough!
 
Tim Holloway
Saloon Keeper
Posts: 27819
196
Android Eclipse IDE Tomcat Server Redhat Java Linux
There's a bit of a difference here. Java, in its "proper" form, is entirely defined within a single directory tree. However, OpenJDK and certain others do various tricks to make it appear as though the components are LSB-compliant. In my case, I define a JAVA_HOME, and put $JAVA_HOME/bin into my PATH and skip by the LSB, because I developed habits that predate that approach and besides, quite a few tools need a JAVA_HOME defined.

Python doesn't work that way. And incidentally, I think all my major machines, excepting the CentOS 7 ones run "/usr/bin/python" as a Python 3. Since Python has never as far as I know been anything but LSB-based, the libraries are separated by Python versions without regard to library versions. If you want a different default Python, use the alternatives system. That's what it's designed for.

I'm always a little leery of components that can be installed by either system package manager or product-specific installer.  Afraid I'll get them crossed. Although Python has enough internal versioning that I think it can deal with that.
 