
[POI apache] How to get specific fields in Word

david simon

Joined: Mar 03, 2014
Posts: 3

I'm trying to get data from specific fields in a Word document.

I open a document for the user that already contains text fields; the user can only fill in those fields.

I'm using bookmarks to get the data with Apache POI, and that works fine. The problem with bookmarks is that the user may or may not see them, depending on his Word configuration (and it is a problem if he can see them). The other issue is that he can type outside a bookmark while still being inside the field, so in the end I can't retrieve what he has written.
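
To illustrate the second issue, here is a minimal sketch of reading bookmark text. It does not use POI; it assumes the main document part (word/document.xml inside the .docx archive) has already been opened as a stream, and it parses the raw WordprocessingML with the JDK's DOM parser. The class name BookmarkReader and its method are hypothetical names chosen for this example. Only text between a w:bookmarkStart and the w:bookmarkEnd with the same id is collected, so anything typed after the end marker is lost.

```java
import java.io.InputStream;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper: collects the text enclosed by each bookmark.
final class BookmarkReader {
    private static final String W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static Map<String, String> readDocumentXml(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(in);

        Map<String, String> open = new HashMap<>();               // bookmark id -> name
        Map<String, StringBuilder> text = new LinkedHashMap<>();  // bookmark name -> text
        walk(doc, open, text);

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : text.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }

    // Walks the tree in document order, tracking which bookmarks are currently open.
    private static void walk(Node node, Map<String, String> open,
                             Map<String, StringBuilder> text) {
        if (node instanceof Element && W.equals(node.getNamespaceURI())) {
            Element el = (Element) node;
            switch (el.getLocalName()) {
                case "bookmarkStart":
                    String name = el.getAttributeNS(W, "name");
                    open.put(el.getAttributeNS(W, "id"), name);
                    text.putIfAbsent(name, new StringBuilder());
                    return;
                case "bookmarkEnd":
                    open.remove(el.getAttributeNS(W, "id"));
                    return;
                case "t":
                    // A text run belongs to every bookmark that is open at this point.
                    for (String n : open.values()) {
                        text.get(n).append(el.getTextContent());
                    }
                    return;
                default:
                    break;
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, open, text);
        }
    }
}
```

If the user types right after the end marker, that text sits outside the start/end pair in the XML, and a reader like this one never sees it.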

My problem is mainly with Word: I'm not an expert, and I'd like to know whether I can use something other than bookmarks for this, or whether I'm doing something wrong.

Thank you in advance for any help you can provide.
Paul Clapham

Joined: Oct 14, 2005
Posts: 19092

Hi David, welcome to the Ranch!

I see this post hasn't had an answer for quite a while. I'm not sure I have anything useful to say, but my feeling is that Word isn't the right tool for getting a user to provide you with specific information. And from reading your post, I think you get that same feeling too. It's just far too easy for the user to mess up your template document.

Problem is, I don't have a good alternative to suggest. I know that it's possible to make a PDF where you can provide fields for the user to enter data, having used such PDFs to claim on my health-care plan and so on. But then that moves the problem to how are you going to extract those fields from the PDF. At work we used Excel to get structured input from users, and that was better than Word because of the built-in grid of cells, but it was still possible for the users to mess up the document.
david simon

Joined: Mar 03, 2014
Posts: 3
Hi, thank you!

Well, I've been asked to try it because that's what the users want. One of my company's applications already has a similar feature: a form in SharePoint that looks like the document and is then generated as a Word or PDF document.

At first the users wanted a new version of the software that would let them change the document (background, styles, position of forms...), but it seems pretty difficult to do that in SharePoint (so they say; SharePoint is not one of my skills). So I'm being asked to make a proof of concept in Java because it's the main language we use. Reading the document is pretty hard, but it's possible, and the same goes for the styles.

I've found something I can possibly work with. It's called a content control: a sort of tag that can be named, and when the document is locked the tag disappears and the control looks like a field, so it seems I may be able to use this. Now I need to retrieve those specific fields with Apache POI, but it's not that easy. If I succeed, I'll probably be able to meet their requirements; otherwise they'll find another SharePoint solution.
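
A minimal sketch of reading content controls, under a few assumptions: it uses only the JDK (ZIP and DOM) rather than POI, it reads only the main document part word/document.xml (controls in headers or footers live in other parts and are not covered), and each control is assumed to have its Tag property set in Word. In the file format a content control is a w:sdt element, its Tag is stored as w:tag inside w:sdtPr, and the user's text is inside w:sdtContent. ContentControlReader and its method names are hypothetical names for this example.

```java
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: maps each content control's tag to the text typed into it.
final class ContentControlReader {
    private static final String W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // A .docx file is a ZIP archive; the body text is in word/document.xml.
    static Map<String, String> readDocx(String path) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IllegalArgumentException("No word/document.xml in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return readDocumentXml(in);
            }
        }
    }

    static Map<String, String> readDocumentXml(InputStream in) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(in);

        Map<String, String> values = new LinkedHashMap<>();
        NodeList controls = doc.getElementsByTagNameNS(W, "sdt");
        for (int i = 0; i < controls.getLength(); i++) {
            Element sdt = (Element) controls.item(i);
            Element props = child(sdt, "sdtPr");
            Element content = child(sdt, "sdtContent");
            Element tag = props == null ? null : child(props, "tag");
            if (tag == null || content == null) {
                continue; // controls without a tag are skipped in this sketch
            }
            // Concatenate every text run inside the control.
            StringBuilder text = new StringBuilder();
            NodeList runs = content.getElementsByTagNameNS(W, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                text.append(runs.item(j).getTextContent());
            }
            values.put(tag.getAttributeNS(W, "val"), text.toString());
        }
        return values;
    }

    // Returns the first direct child element with the given local name, or null.
    private static Element child(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && W.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

Calling ContentControlReader.readDocx on a filled-in copy of the template would then return one entry per tagged control, in document order. Paragraph breaks inside a control are not preserved here.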

As for your proposal, if it were possible I'm sure they wouldn't mind, but I'm not even sure whether reading data from a PDF is possible. The problem with Excel is that we don't want the users to mess with the fields too much, or the software won't be able to read the file properly. The advantage of Word is that, once the document is locked, they can only edit the forms and can't mess with the document itself.

Well, I'm stuck with this for now. I'll try my best with content controls, but I'm still open to other solutions.