Results 26 to 50 of 87
  1. #26
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Quote Originally Posted by otnemem
    It is not only useful but almost non-negotiable for the end-user and I'm OK with that, but what is really worrying me is parsing those kinds of formats. I hope I'm not mistaken but that's going to be the hard part.
    There are actually a lot of options. Firstly we are only making slight modifications, we don't need to parse the entire file. Secondly, on those machines without the COM extension (if we needed it), RTF/XML could be used instead.

    The real fun begins later on. What happens when we start reporting on project progress (a nice side effect)? Interfacing with M$ project anyone? There is a real possibility that this would be a pilot project for something that would be ported to .NET or Java.

    Before anything like that though, it needs to be demonstrated working. RTF should be fine.

    yours, Marcus
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  2. #27
    SitePoint Wizard
    Join Date
    Aug 2004
    Location
    California
    Posts
    1,672
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It certainly is the hard part. I have parsed word processor documents in the past and that is why I recommended using a rich textarea. I found that it was better to copy/paste and rehighlight than try to make sense of word processor markup. Maybe if you focus only on EOL and BOLD you can get it to work. Bulleted lists will be tricky. But you will run into cases where you get markup for your example "They log in to their account." like:

    "They {b}log{/b} {b}in{/b} to their {b}ac{/b}{b}count.{/b}"

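    A rough sketch of one way to tame fragmented markup like the example above: merge bold runs that are separated only by whitespace (or by nothing) before trying to read any meaning into them. The `{b}`/`{/b}` notation is just the shorthand from the example, and `merge_bold_runs` is a made-up helper name, not part of any existing tool.

```python
import re

# Hypothetical helper: a closing {/b} followed directly (or after
# whitespace only) by an opening {b} is treated as one continuous run.
_ADJACENT_BOLD = re.compile(r"\{/b\}(\s*)\{b\}")


def merge_bold_runs(text):
    """Collapse neighbouring {b}...{/b} runs, keeping the whitespace between them."""
    return _ADJACENT_BOLD.sub(r"\1", text)


sample = "They {b}log{/b} {b}in{/b} to their {b}ac{/b}{b}count.{/b}"
merged = merge_bold_runs(sample)
print(merged)
```

    Real word processor output has many more ways to split a run (font changes, revision marks and so on), so this only covers the simplest case.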
    When you say, "A programmer and client sit down next to each other with the requirements document..." why specify a word processor? Wouldn't you have a web connection 99% of the time?

    And things like the following seem pretty scary: "The requirements docs. can be uploaded and downloaded by everyone in the project. Both developers and business people can edit stuff at certain times, and a record and history is kept of all of this." Besides the difficulty of it, isn't this what web applications are meant to avoid? Escaping the MS Office mentality and going browser centric is the goal for making stuff like this work.

    I am very interested in this project, but still think it would be much more powerful if you could control the input. At a minimum you need to specify what the target markup format is (i.e. the result of importing a word processor document). Best would be an XML/HTML format that a rich textarea could handle. Then you could:

    1. Import word processor documents if necessary or start straight in a browser.

    2. If the target format is specified you can get started on other parts of the system without being dependent on the importer (if it ever works).

    3. Allow the programmer to re-edit the document (after meeting with the client) to add more meaning. This would improve requirements, test and code generation while not wasting the client's time.

    4. Make it easier to control who can edit what, track changes, etc.

    5. Simplify things for non-technical users so they don't have to upload/download to do anything.

  3. #28
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Quote Originally Posted by arborint
    It certainly is the hard part.
    There are several hard parts. I can see the document management system ballooning too. However, initially we have to get something working.

    Quote Originally Posted by arborint
    When you say, "A programmer and client sit down next to each other with the requirements document..." why specify a word processor? Wouldn't you have a web connection 99% of the time?
    That's only one situation. Likely the documents will be half written before the meeting. Even more so once the client understands what is happening.

    Documents have a lot of advantages. You can e-mail them back and forth and having the client have more confidence in the tool than the developer encourages them to edit the document themselves. You can also embed arbitrary content such as spreadsheets. It prints too.

    Wiki based systems mean that the client has to sit quietly while the developer types things in and then the client needs to ratify it. That's not what I want. I want the client to write the specs., but with the developer asking for clarification in the form of tests, and the client getting feedback when they pass.

    Quote Originally Posted by arborint
    Besides the difficulty of it, isn't this what web applications are meant to avoid? Escaping the MS Office mentality and going browser centric is the goal for making stuff like this work.
    I am not trying to convert the world to web applications. I am trying to communicate with the clients in their language. I have spent about four months gathering information for this project, it hasn't come out of thin air. They use Word. I don't like it, you don't like it, but they do so we have to deal with it.

    Quote Originally Posted by arborint
    5. Simplify things for non-technical users so they don't have to upload/download to do anything.
    Then we'll have it optionally work by e-mail. This was actually an earlier idea, but the developer side seems to think it is too restrictive.

    Sorry, I'm just not going to bend on this right now. After the first alpha or so, we can start to open up the project for other features (such as direct web input). I see the most important thing as getting something out there in use at the earliest possible date. For that to work the tool must be immediately accessible.

    yours, Marcus

    p.s. I couldn't get "phat" freed up as an SF name and couldn't come up with a decent fancy acronym. It is going into SF as "arbiter".

    <crap_justification>After all, its role is client and developer mediation.</crap_justification>
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  4. #29
    SitePoint Wizard
    Join Date
    Aug 2004
    Location
    California
    Posts
    1,672
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I understood clearly the direction you wanted to go after my first post. I'm not trying to talk you out of it.

    I don't think arguments like the following really make any difference to whether the content is edited in an application or a browser:
    - You can e-mail them back and forth
    - having the client have more confidence in the tool
    - encourages them to edit the document themselves.
    - You can also embed arbitrary content such as spreadsheets.
    - It prints too.
    - the client has to sit quietly while the developer types things in

    I think my main point, perhaps lost in my post, was that you should specify an intermediate format for your documents, probably in XML. You can easily generate word processor documents or PDFs from it, and it is a target to import documents to. It modularizes your system. You can validate it. etc. etc.

    If your goal is "getting something out there in use at the earliest possible date." you will get there much faster starting with an XML standard that you can even hand code initially.
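    To make the "hand code it initially" idea concrete, here is a minimal sketch of what a hand-written intermediate document and a loader for it might look like. The element and attribute names (`requirements`, `requirement`, `text`, `term`) are invented for illustration; nothing in this thread has settled on a schema.

```python
import xml.etree.ElementTree as ET

# Hand-coded document in a made-up intermediate format.
# All element and attribute names here are assumptions.
SAMPLE = """\
<requirements project="arbiter">
  <requirement id="R1">
    <text>They <term>log in</term> to their account.</text>
  </requirement>
  <requirement id="R2">
    <text>They see a list of <term>orders</term>.</text>
  </requirement>
</requirements>
"""


def load_requirements(xml_text):
    """Return {id: {"text": plain text, "terms": highlighted terms}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for req in root.findall("requirement"):
        text_el = req.find("text")
        plain = "".join(text_el.itertext())          # text with markup removed
        terms = [t.text for t in text_el.findall("term")]
        result[req.get("id")] = {"text": plain, "terms": terms}
    return result


requirements = load_requirements(SAMPLE)
print(requirements["R1"])
```

    With a target like this fixed, an importer for Word or RTF and the rest of the system could be worked on separately.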

    Still a very interesting project.

  5. #30
    SitePoint Member jepa's Avatar
    Join Date
    Aug 2003
    Location
    WA
    Posts
    5
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I think you said the name in your last post:

    Sourceforge name: Client and Developer Mediator (CDM)


    <edit>
    or maybe:

    PHp Acceptance Mediator (PHAM)
    </edit>

    Jepa

  6. #31
    Resident Java Hater
    Join Date
    Jul 2004
    Location
    Gerodieville Central, UK
    Posts
    446
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by lastcraft
    Hi...



    The plan is that it doesn't matter what I think.

    Although I am "dictating" the first alpha version, it's vital that it becomes user/community driven or it won't work. In SimpleTest I had a rule that if two people asked for a feature then I would queue the feature even if I didn't believe in it. Every time I found myself using the very feature that I would have rejected on my own.

    I think that this rule will one day be appropriate for this project too.

    yours, Marcus
    Oh, well if ever I need a feature in anything you develop I'll now know for future reference to send a similar email under an aliased email address

  7. #32
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Quote Originally Posted by arborint
    I think my main point, perhaps lost in my post, was that you should specify an intermediate format for your documents, probably in XML.
    Ah, OK, I misunderstood. That's more of an internal issue, so I'll leave that up to the developers.

    yours, Marcus
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  8. #33
    Resident Java Hater
    Join Date
    Jul 2004
    Location
    Gerodieville Central, UK
    Posts
    446
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    As far as general Office support goes, I think COM is the way forward. It provides a very safe way to interface with an Office document due to the versioning model. Generally newer versions of Office will have wrappers for the COM interfaces found in older versions, and I can't see the interface changing too much. Directly trying to parse a binary file written in Office would be unsafe, because Microsoft might for some pointless reason decide to make the file format non backward compatible.

    The drawback with COM is that it is slow as $*!T. I guess that isn't too bad, as people won't be uploading Word docs at the same rate as visitors access a high traffic portal. I would assume that COM is safe enough to use in PHP from a reliability standpoint.

    What I do think is a very good thing is to have support for the HTML file format. This would be easy to support as SimpleTest has an XML parser, likewise there's HTML SAX, and all the new PHP5 stuff. The other reason why I say this is that we can use COM to export Word into an HTML file. This would save us farting about with COM, trying to find out things like text formatting via COM. Word's HTML output can be imported back into Office safely thanks to the excessive amount of pointless XML markup and other junk the file gets polluted with. From a coding perspective, a Word HTML parser wouldn't be much different from parsing a normal HTML document. It would also make a very useful by-product, namely a script that cleans up Word HTML files.

    I think other office suites also have ways to export as HTML. If I remember rightly Open Office or K Office are trying to push for some universal XML based file format. This is another reason to go down the XML/HTML path and simply use COM to 'export as HTML'.

    Like Marcus, I do agree we stay clear of the Javascript RTE editor. As programmers we are used to finding ways round things, and working round problems, it's our job FFS! However the average business user / client will not have the time or technical skills to be doing this. Remember your everyday business client is amazingly thick when it comes to working a computer; the last thing you want to do is introduce unneeded learning curves. Having worked a lot with the design side of web work as well as programming (along with experience tutoring ppl in packages like photoshop, etc etc), I find that putting time into making good user interfaces makes a world of difference. Businesses out there are willing to pay a lot more for something that is easy to use than to train people, as training has the hidden cost of time. Look at Windows, it's technically a piece of poo dangling from a rotting stick, but businesses still use it (despite the huge cost now associated with maintaining it due to security flaws).

    It's not *that* hard to work with Office. As I've pointed out, the COM export / XML+HTML approach should prove to be a fairly straightforward route, which offers long term scalability as it can easily be adapted to other XML based formats. Also Excel supports saving to a nasty table based HTML format.

    If you want to look at simplifying things from the upload / download perspective, why not look into WebDAV? I have no idea how this fully works but it sounds like this is a suitable protocol for the job. I also like the idea of email. These sorts of ideas sound fun, and will make the project stand out from the normal web interface people are used to. Currently I'm looking into working on a backend that allows some sort of XML based communication (normally via Flash or something) to allow a PHP script to generate a 3D scene and render it back out to the browser. Harry's research into XMLHTTPRequest certainly ties in with this and I think projects like this put developers at the cutting edge. PHP needs these sorts of projects to make it stand out. There are too many crappy projects out there that are PostNuke in a reworked fashion.

    -- Jason

  9. #34
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    I know absolutely nothing about WebDAV. Anyone have a quick idiot's explanation for me? The reason for going for RTF was to add the COM dependency as an option later. I use a Unix box at home so I'll be playing with old versions of OpenOffice.

    The HTML format might be a go'er if it can be kept pretty hidden from the users most of the time and nothing gets lost (e.g. images, embedded objects) during the transfer. All of this needs investigation.

    And yes I think it is at least a little "cutting edge". It makes a change from porting ideas from Java.

    yours, Marcus
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  10. #35
    SitePoint Zealot
    Join Date
    Aug 2003
    Location
    Brisbane, QLD
    Posts
    101
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    webdav is basically an extension of HTTP which allows manipulation of remote files. there's an apache module available which is quite stable for 1.3.x, and it's shipped with apache 2.0.

    also windows is pretty into webdav (frontpage extensions are kinda a *******ized version i believe i read somewhere), so it's something that should be somewhat easily available...
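    For the "idiot's explanation" asked for above, a small sketch of what "an extension of HTTP" means in practice: a client stores a file with an ordinary HTTP PUT and lists a folder with the WebDAV method PROPFIND plus a Depth header. The server URL and file name are placeholders, and the requests are only built here, never sent.

```python
import urllib.request

# Hypothetical server URL; nothing is sent, the requests are only constructed.
BASE_URL = "http://example.com/dav/requirements/"


def build_upload(filename, body):
    """Build (but do not send) a PUT request that would store a document."""
    return urllib.request.Request(
        BASE_URL + filename,
        data=body,
        headers={"Content-Type": "application/rtf"},
        method="PUT",
    )


def build_listing():
    """Build (but do not send) a PROPFIND request for the folder's direct children."""
    return urllib.request.Request(
        BASE_URL,
        headers={"Depth": "1"},   # "1" = the collection plus its immediate members
        method="PROPFIND",
    )


upload = build_upload("spec.rtf", b"{\\rtf1 Hello}")
listing = build_listing()
print(upload.get_method(), upload.full_url)
print(listing.get_method(), listing.full_url)
```

    Passing either object to `urllib.request.urlopen` would actually send it; a real server would also want authentication, which is left out here.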

  11. #36
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Bogota
    Posts
    101
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by MiiJaySung
    ... PHP needs these sort of projects to make it stand out. There are too many crappy projects out there that are PostNuke in a reworked fashion.

    -- Jason
    Totally agree. I think we can start with the COM-to-HTML/XML idea. This will help to push alternatives to COM that are friendlier with non-MS platforms.

    I also want to add that we could be moving towards a better interoperability for PHP with projects like this.

    When can I start coding?

    - Andres
    If I have wings, why am I walking?

  12. #37
    SitePoint Zealot
    Join Date
    Jun 2004
    Location
    Bogota
    Posts
    101
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by lastcraft
    And yes I think it is at least a little "cutting edge". It makes a change from porting ideas from Java.
    Indeed.

    PHP needs this
    If I have wings, why am I walking?

  13. #38
    SitePoint Guru OfficeOfTheLaw's Avatar
    Join Date
    Apr 2004
    Location
    Quincy
    Posts
    636
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You know, I wonder if doc could be generated easier using the new xml format in the latest versions of word?

    James Carr, Software Engineer


    assertEquals(newXPJob, you.ask(officeOfTheLaw));

  14. #39
    SitePoint Addict been's Avatar
    Join Date
    May 2002
    Location
    Gent, Belgium
    Posts
    284
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    First, respect to Marcus for taking on such a project!
    I'd like to help, but since I cannot really imagine how I could be of any help, I guess I'm not meant to participate (if someone thinks of something, let me know, plz, it would be nice to give something in return for SimpleTest)
    I'll just let some ideas out, maybe there's something useful in it...

    The client is central, I think Marcus is spot on about the Word thing, not that my 'career' is so extensive or anything, but very rarely I've seen documents from clients in anything other than Word .doc
    I think it's almost essential that the clients should be free to write the documents in the tool of their liking.
    I do think this will need specialized parsers, but at the start, word doc, txt, xhtml will cover the most part.

    Would it be too much to just ask the client to "save as text" in their Word ?

    Maybe another way would be a sort of office plugin, which would be a button on the toolbar (for example in Word). Clients wouldn't have to email anything just type the document and press the button...
    This has to be installed of course, which would be one of the main arguments against I guess.

    Cvs server as a central document repository, documents stored as xml.
    Automated import and export of files to and from the repository, including transformation to desired end format (specialized parsers on import, xslt on export?)

    The night before, I wasn't that sober anymore I must admit, I thought "WYSIWYG" was a very 'deep' name for the project
    Per
    Everything
    works on a PowerPoint slide

  15. #40
    SitePoint Enthusiast DmS's Avatar
    Join Date
    Jan 2004
    Location
    Stockholm, Sweden
    Posts
    94
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi there.
    Respect => lastcraft!

    As a "former client for 10 years then turned developer" kind of guy I can tell you that this would have been a godsend app in a multitude of situations!

    A year or so back I made a simple webtool where you can add a question package (for instance a set of requirements) with weighted questions and a 1-5 answer for degree of fulfillment of the requirement plus comments for each Q.
    You can set a degree of % that has to be reached in order for a requirement to be tagged as green (fulfilled) and you get an answer out of it on each question, aggregated and accumulated.

    You can test your product against the same questionpack as many times as you like and then show a history to see if things improve or not.
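    As a rough illustration of the scoring just described (weighted questions, a 1-5 answer each, and a percentage threshold for "green"), here is a minimal sketch. How the 1-5 answers map to a percentage is not stated above, so the linear mapping (5 = fully met) and all names used here are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    question: str
    weight: int    # relative importance of the question
    score: int     # degree of fulfilment, 1 (not met) to 5 (fully met)
    comment: str = ""


def fulfilment_percent(answers):
    """Weighted fulfilment in percent, assuming a score of 5 counts as 100%."""
    total_weight = sum(a.weight for a in answers)
    if total_weight == 0:
        return 0.0
    return 100 * sum(a.weight * a.score for a in answers) / (5 * total_weight)


def is_green(answers, threshold_percent):
    """A requirement is 'green' once the weighted result reaches the threshold."""
    return fulfilment_percent(answers) >= threshold_percent


answers = [
    Answer("Can a user log in?", weight=2, score=5),
    Answer("Is the error message clear?", weight=1, score=2, comment="too terse"),
]
print(fulfilment_percent(answers), is_green(answers, 75))
```

    Storing each run's result with a date would give the history view mentioned above.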

    It's rudimentary and you have to test and answer manually at this point, but I have used it in accessibility analysis for clients and the results were well received by the clients.

    As for the dreaded office format, yup, like it or not, you MUST support it to get a tool like this in use by regular users, no question about it.

    I remembered some talk about this on a CMS list and scrounged up this link for an ongoing project for TYPO3 that might help a bit further on.
    http://typo3.org/uploads/media/TYPO3...mentsSuite.pdf

    And again, superb project, I just wish I had 24 hours more each day...
    /Dan
    { knowledge is what remains once you forget what you learned }
    Home: DmSProject Tutorials: GurusNetwork
    Committed at:
    OzoneAsylum + Blog

  16. #41
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Quote Originally Posted by been
    First, respect to Marcus for taking on such a project!
    I'd like to help, but since I cannot really imagine how I could be of any help,...
    You need to score yes in any two of these...

    1) Can code OO in PHP?
    2) Can write a SimpleTest test suite?
    3) Have a copy of Word or close other?
    4) Have a web server on your home box to test with?
    5) Have any good ideas on parsing word processor documents?
    6) Have even a modicum of enthusiasm for the project?
    7) Occasionally have to do requirements gathering?
    8) Are a likeable guy/gal?

    ...and that's it.

    yours, Marcus
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  17. #42
    SitePoint Member jepa's Avatar
    Join Date
    Aug 2003
    Location
    WA
    Posts
    5
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I found a project at sourceforge that converts word files into many different formats. I have not had a lot of time to look into it, but maybe it has a dictionary or a parser that we can use as a starting point for our php parser. It is written in C.

    wvWare

    Jepa

  18. #43
    Non-Member
    Join Date
    Jan 2004
    Location
    Planet Earth
    Posts
    1,764
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I am likeable

  19. #44
    SitePoint Enthusiast Refresh's Avatar
    Join Date
    Jul 2004
    Location
    Lausanne, Switzerland
    Posts
    46
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by arborint
    When you say, "A programmer and client sit down next to each other with the requirements document..." why specify a word processor? Wouldn't you have a web connection 99% of the time?
    NO! =) I wish my clients did, though

    @Marcus: count me in, even if I won't have lots of time to dedicate...

    By the way, it's not hard to ask the clients not to forget to convert the document to RTF just before saving IMHO, but true, we really want to minimize hassle for the client...

    I PMed you my SF id.

    Cheers!

  20. #45
    SitePoint Enthusiast Refresh's Avatar
    Join Date
    Jul 2004
    Location
    Lausanne, Switzerland
    Posts
    46
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by jepa
    I found a project at sourceforge that converts word files into many different formats. I have not had a lot of time to look into it, but maybe it has a dictionary or a parser that we can use as a starting point for our php parser. It is written in C.

    wvWare

    Jepa

    HTMLArea has a (presumably well tested) word text cleaning feature, which we could maybe rip?

  21. #46
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Quote Originally Posted by Refresh
    I PMed you my SF id.
    Still waiting for the project setup.

    yours, Marcus
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  22. #47
    Resident Java Hater
    Join Date
    Jul 2004
    Location
    Gerodieville Central, UK
    Posts
    446
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Widow Maker
    I am likeable
    That's debatable I think :-P

  23. #48
    Resident Java Hater
    Join Date
    Jul 2004
    Location
    Gerodieville Central, UK
    Posts
    446
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Refresh
    HTMLArea has a (presumably well tested) word text cleaning feature, which we could maybe rip?
    I looked at that. All it does is use a regexp to strip things like <o> or whatever (basically, because all Word tags have a set namespace, it's easy to do with a regexp).
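    For reference, the regexp approach described above can be sketched in a few lines. This is not HTMLArea's actual code, just an illustration of the idea: any tag whose name carries a namespace prefix (such as Word's `<o:p>`) is dropped, and ordinary HTML tags are left alone. The function name and the sample input are made up.

```python
import re

# Matches opening or closing tags whose name has a namespace prefix,
# e.g. <o:p>, </o:p>, <w:something attr="x">. Plain tags like <p> or <b>
# have no colon in the tag name and are not matched.
_NAMESPACED_TAG = re.compile(r"</?[A-Za-z]\w*:[^>]*>")


def strip_namespaced_tags(html):
    """Remove namespaced tags but keep the text and plain HTML around them."""
    return _NAMESPACED_TAG.sub("", html)


sample = '<p class="MsoNormal">They <b>log in</b> to their account.<o:p></o:p></p>'
cleaned = strip_namespaced_tags(sample)
print(cleaned)
```

    As the next paragraph says, a blunt pass like this throws information away, which is the argument for a proper parser instead.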

    I don't like the idea of that; I found it normally loses a large degree of formatting. Marcus has a very good lexer with an XML parser class that will parse the Word HTML (or XML, if you dare say it is XML), which will probably be needed for a few operations. We can strip out Word XML using this, as it will be a more scalable way forward.

    As far as I see, support for .DOC and maybe .XLS is very important. However we mustn't underestimate other formats that other programs use. All the major office programs out there seem to have an option to save in some XML based format, and using COM we can convert Office to HTML/XML format. Like Widow Maker I think an intermediate format is needed. The problem comes with RTF as it's not XML based. However if the format converter classes follow some set of interfaces we can do this nice and cleanly, it just won't use an XML parser / DOM like system. We can still use the Lexer Marcus made to parse RTFs though.
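    A minimal sketch of the "converter classes follow some set of interfaces" idea: each format gets its own converter, and all of them produce the same intermediate result (here just a list of paragraph strings, which is a simplifying assumption). The class names, the `to_paragraphs` method and the extension lookup are all invented for illustration; an RTF converter built on a lexer would plug in the same way.

```python
import os
from abc import ABC, abstractmethod
from html.parser import HTMLParser
from typing import List


class FormatConverter(ABC):
    """Hypothetical common interface: raw file text in, paragraphs out."""

    @abstractmethod
    def to_paragraphs(self, raw: str) -> List[str]:
        ...


class PlainTextConverter(FormatConverter):
    def to_paragraphs(self, raw):
        # Paragraphs are separated by blank lines.
        return [p.strip() for p in raw.split("\n\n") if p.strip()]


class _ParagraphCollector(HTMLParser):
    """Collects the text inside each <p> element, ignoring inline tags."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            self.paragraphs.append("".join(self._buffer).strip())
            self._buffer = None


class HtmlConverter(FormatConverter):
    def to_paragraphs(self, raw):
        collector = _ParagraphCollector()
        collector.feed(raw)
        collector.close()
        return collector.paragraphs


CONVERTERS = {".txt": PlainTextConverter(), ".html": HtmlConverter()}


def convert(filename, raw):
    """Pick a converter by file extension and run it."""
    extension = os.path.splitext(filename)[1].lower()
    return CONVERTERS[extension].to_paragraphs(raw)


print(convert("spec.html", "<p>They <b>log in</b>.</p><p>Done.</p>"))
```

    The rest of the system would only ever see the output of `convert`, so whether a format is parsed with an XML parser or a lexer stays an internal detail of its converter.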

    Another way forward would be to look into XSL if various office packages can save in an XML based format. I have little knowledge on this front to say how viable this is.

    I think we have more or less agreed on how we can deal with different office file formats (well, as far as roughly how we tackle the problem). What other bits do we need to think about before we start planning implementations or end up repeating ourselves with the topic of file formats ???

    -- Jason

  24. #49
    ********* Victim lastcraft's Avatar
    Join Date
    Apr 2003
    Location
    London
    Posts
    2,423
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Hi...

    Quote Originally Posted by MiiJaySung
    ...what other bits do we need to think about before we start planning implementations or end up repeating ourselves with the topic of file formats ???
    If you want to get off to a quick start while I struggle with Sourceforge on your behalf...

    For running the project first:

    Have a look at Scrum, http://www.controlchaos.com/ and also the TODO list system in SimpleTest. I intend to borrow "load factor", unit testing, acceptance testing, shared code ownership and refactoring from XP and take the "backlog", burn down and monthly cycle from Scrum.

    On what I am trying to get at:

    For the requirements and glossary side of things you might want to look at Domain Driven Design by Eric Evans. I also make a lot of use of the Aggregate pattern in my code these days, but that's an aside.

    The only requirements tool with integrated testing so far is FIT http://fit.c2.com/, developed into Fitnesse http://fitnesse.org/. These tools work better for XP teams with an integrated customer within the team.

    Components:

    I have included a rough UML sketch. It's a bit dated already and certainly not gospel. I'll check it into CVS when I actually can.

    yours, Marcus
    Attached Images
    Marcus Baker
    Testing: SimpleTest, Cgreen, Fakemail
    Other: Phemto dependency injector
    Books: PHP in Action, 97 things

  25. #50
    SitePoint Enthusiast NativeMind's Avatar
    Join Date
    Aug 2003
    Location
    USA
    Posts
    80
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I do requirements engineering for a living in a very large enterprise R&D organization...

    There are many high end requirements engineering tools, like IBM's Rational line of products that integrates with Rational Rose, and several other products.

    The most basic thing you can do is write a word document. Usually you have a standard template to write your document, and that's really only so you remember to address all the important areas: install/upgrade, backwards compatibility, performance/capacity, init and recovery, security, maintenance/servicing, reporting/auditing, etc.

    Most often you need numbered requirements, usually coupled with a scenario - and even better, an architecture description. As soon as you get numbered requirements, it's easy to put them in a list and relate them to other deliverables of a project. Also, requirements usually get tagged with a few attributes: release, status (accepted, or draft), source (who came up with it, or who wants this), and possibly other category type attributes.

    My bliss would be a system where you can take a requirement and trace it from requirement -> architecture description -> design [document section] -> code delivery -> test case, and get status on each link in the chain. So you can say, we have implemented 48% of the requirements, and 80% of the tests have passed.

    A wiki is a good start, but it would need some explicit ways to do traceability of requirements.
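    As a very small sketch of the traceability described in this post, the snippet below models numbered requirements with the attributes mentioned (release, status, source), links each one to its checks, and reports the two percentages. It collapses the chain to requirement -> code delivery -> test case, leaving out the architecture and design links, and every name in it is an assumption for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AcceptanceCheck:
    name: str
    passed: bool


@dataclass
class Requirement:
    number: str
    release: str
    status: str            # "accepted" or "draft"
    source: str            # who asked for it
    implemented: bool = False
    checks: List[AcceptanceCheck] = field(default_factory=list)


def progress(requirements):
    """Return (% of accepted requirements implemented, % of their checks passed)."""
    accepted = [r for r in requirements if r.status == "accepted"]
    checks = [c for r in accepted for c in r.checks]
    implemented = 100 * sum(r.implemented for r in accepted) / len(accepted) if accepted else 0.0
    passed = 100 * sum(c.passed for c in checks) / len(checks) if checks else 0.0
    return implemented, passed


requirements = [
    Requirement("R1", "1.0", "accepted", "client", implemented=True,
                checks=[AcceptanceCheck("login works", True),
                        AcceptanceCheck("logout works", True)]),
    Requirement("R2", "1.0", "accepted", "support team",
                checks=[AcceptanceCheck("report renders", True),
                        AcceptanceCheck("report totals", False),
                        AcceptanceCheck("report prints", True)]),
    Requirement("R3", "1.1", "draft", "client"),   # drafts are not counted
]
implemented_pct, passed_pct = progress(requirements)
print(implemented_pct, passed_pct)
```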

