From: bullockbefriending bard
Subject: Good Environment for Data Modeling R&D
Date: 
Message-ID: <b7da025a-66dd-4fe5-a87a-184fde4ffe74@y22g2000prd.googlegroups.com>
Greetings Cantankerous Lisp Gods.

I stand before you as a reformed C++ coder (programmer would be
stretching it) who has arrived at this time and place via Ruby
(wheee!) and Python (sometimes have to get stuff done). I've now
started on SICP, have PAIP in the mail from Amazon and given another
few years might end up spending money with Franz... but it's a long
road out of the darkness.

OK, having got that incantation over and done with, I'm now going to
test your patience with a slightly OT question:

I work for a company which makes its money in an unusual kind of
market by doing market prediction. We're professional gamblers who bet
on this and that... mainly four-legged animals. It's quite a good
business to be in if you know what you're about. A Google for Ziemba
and Hausch should be enough to give a 10,000 ft view of what we're
about and how we do it (broadly speaking, logit/probit models). It's
not a bad niche for a fairly smart misfit, since all the super-genius
non-misfits are off working in hedge funds, and one can do OK in
certain betting markets.

Anyway, our current system is a mix of legacy Win32 stuff and some
more recent .NET code, plus an RDBMS and some SAS stuff too.

At the end of the day, if it all works well in production, then the IT
side of the business is fine. This it does. No complaints there.

Where I see a problem is that this is nowhere near flexible enough for
our model R&D guys. They need to be able to 'play' with data and
quickly try new ideas - preferably without having to talk to the IT
guys who don't understand any of this higher-order stuff and think
that C# and Design Patterns are the Meaning of Life. Necessary and
good solid fellows for sure, but not into the whole standing on the
shoulders of giants and seeing new things paradigm.

So, I guess what I am asking here is: does anybody have experience in
exploratory work with a large corpus of data which updates regularly
and lives in an RDBMS backing store (at least the master copy does)...
and how have you structured your work so that you get your stuff done
without breaking anything on the IT side, and how do you propagate
changes/developments back into the production system?

I am not really looking for specifics here, more general strategies
and overviews about how one goes about setting up and managing an R&D
playpen/sandpit so that R&D guys can access everything with total
autonomy.

There must be some of you guys doing smart stuff in your Lisp images
with potentially huge impedance mismatches between your workspaces and
those of the guys who handle the databases, production side, etc.

Yes, I know I should probably be posting this in some kind of data
mining group, but I have a feeling that folks here are more likely to
get what I'm driving at. Also, I am not really asking for various
proprietary solutions; I'm more interested to hear about general
principles from people who have dealt with this issue.

Five years down the track, I'd be happy if everything we were doing at
the sharp intellectual end of the business was a lot more functional
and possibly lispy... but this is a long road. Right now I'm just
trying to get my head around giving the research guys the right kind
of environment and freedoms.

Sorry for the OT posting, and would appreciate very much if anybody
has any enlightening observations!

From: Ken Tilton
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <4831c77f$0$11612$607ed4bc@cv.net>
bullockbefriending bard wrote:
> Greetings Cantankerous Lisp Gods.
> 
> I stand before you as a reformed C++ coder (programmer would be
> stretching it) who has arrived at this time and place via Ruby
> (wheee!) and Python (sometimes have to get stuff done). I've now
> started on SICP, have PAIP in the mail from Amazon and given another
> few years might end up spending money with Franz... but it's a long
> road out of the darkness.
> 
> OK, having got that incantation over and done with, I'm now going to
> test your patience with a slightly OT question:
> 
> I work for a company which makes its money in an unusual kind of
> market by doing market prediction. We're professional gamblers who bet
> on this and that... mainly four legged animals. It's quite a good
> business to be in if you know what you're about. A google for Ziemba
> and Hausch should be enough to give a 10,000 ft view of what we are
> about and how we do it (broadly speaking logit/probit models). Not a
> bad business to be in if you're a fairly smart misfit as all the super
> genius non-misfits are working in hedge funds and one can do OK in
> certain betting markets.
> 
> Anyway, our current system is a mix of legacy Win32 stuff and some
> more recent dot Net code + a RDBMS + some SAS stuff too.
> 
> At the end of the day, if it all works well in production, then the IT
> side of the business is fine. This it does. No complaints there.
> 
> Where I see a problem is that this is nowhere near flexible enough for
> our model R&D guys. They need to be able to 'play' with data and
> quickly try  new ideas - preferably without having to talk to the IT
> guys who don't understand any of this higher order stuff and think
> that C# and Design Patterns are the Meaning of Life. Necessary and
> good solid fellows for sure, but not into the standing on shoulders on
> giants and seeing new things paradigm.
> 
> So, I guess what I am asking here is does anybody have experience in
> exploratory work with large corpuses of data which updates regularly
> and lives in a RDBMS backing store (at least master copy does)... 

No, but ignorance has never stopped me before: Have you looked at 
pulling the data into an RDF triple store where the R&D guys can get at 
it with OWL and SPARQL and other fancy-dan tools?
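
Rough idea only -- I have not built this against your data, and the
subjects and predicates below are invented for illustration -- but a
triple store is at bottom just a pile of (subject predicate object)
facts plus pattern matching. A throwaway sketch in CL:

  ;; A toy triple "store": a flat list of (subject predicate object)
  ;; facts about made-up racing data.
  (defparameter *triples*
    '((race-42 :venue    sha-tin)
      (race-42 :distance 1400)
      (horse-7 :ran-in   race-42)
      (horse-7 :finished 1)))

  ;; Match a pattern against the store; the symbol ? means "anything".
  (defun query (s p o)
    (remove-if-not (lambda (triple)
                     (every (lambda (want got)
                              (or (eq want '?) (equal want got)))
                            (list s p o) triple))
                   *triples*))

  ;; (query 'horse-7 '? '?) => every fact we have about horse-7

The real tools layer SPARQL, OWL reasoning and persistence on top of
that, but the underlying data model really is that simple.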

How you get that back into production is: throw out the RDBMS (that is 
so 20th century) and swap in the RDF stuff. One tip: the folks in change 
control might want a heads-up before you throw the switch.

kt

-- 
http://smuglispweeny.blogspot.com/
http://www.theoryyalgebra.com/
ECLM rant: 
http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
ECLM talk: 
http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en
From: bullockbefriending bard
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <d26cc0d3-6c2c-4ad7-80f7-230b51f15d4f@y22g2000prd.googlegroups.com>
> No, but ignorance has never stopped me before:

I know the feeling.

> Have you looked at
> pulling the data into an RDF triple store where the R&D guys can get at
> it with Owl and Sparql and other fancy dan tools?

Whilst I would like to eventually get into the whole metadata
business, this won't happen next week. In our line, most of the data
is either concrete historical numbers or real-time data, some of which
is run through feature extraction before we fit multinomial
logit/probit regressions against it.
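
For anyone following along at home: grossly simplified, and nothing
like our actual models (the weights and features below are made up),
the multinomial logit part boils down to a softmax over per-runner
linear scores -- the kind of thing this computes:

  ;; Toy multinomial logit: each runner gets a linear score from its
  ;; feature vector; win probabilities are the softmax of the scores.
  (defun score (weights features)
    (reduce #'+ (mapcar #'* weights features)))

  (defun win-probabilities (weights runners)
    ;; RUNNERS is a list of feature lists; returns one probability each.
    (let* ((exps (mapcar (lambda (f) (exp (score weights f))) runners))
           (z    (reduce #'+ exps)))
      (mapcar (lambda (e) (/ e z)) exps)))

  ;; (win-probabilities '(0.8 -0.3) '((1.2 0.5) (0.9 1.1) (1.5 0.2)))
  ;; => three probabilities summing to 1

The research work is all in where the features come from and how the
weights get estimated, not in that last step.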

> How you get that back into production is throw out RDBMS (that is so
> 20th century) and swap in the RDF stuff. One tip: the folks in change
> control might want a heads up before you throw the switch.

Right... and just before we warp into hyperspace, some VB weenie is
going to pop up and say "but where's the loop?". (I actually DID hear
someone (with a good amount of mathematical maturity, no less) say
that in about 1993 upon first seeing SQL. :)

Your idea is sound. No matter how we do things now, it is necessary
for us to get away from two things which stink of the 20th Century -
RDBMSs, and C++/Java/C#-like programming paradigms. I know this. But I
have to hasten slowly. It's taking time to get our guys to see the
virtues of Python, let alone anything beyond that... Eventually we're
going to need pervasive semantic tagging or whatever in our data so
that we can 'Be Google' in what it is that we do. If we don't, well,
someone else will and we will be blown away. I know this. But the IT
guys are back in 1996 swotting away at their certification exams and
think that there's a design pattern for every problem. The changes I
want to make will take time and will have to begin small. Even some of
our modeling guys don't quite get it. I could show them papers on
(say) Haskell probability monads (which might be a good thing in our
line of work) and they would say (you guessed it) 'but where's the
loop?' This is not to say that they are dummies, just that for anyone
who isn't a programming-language enthusiast, these things are a bit
obscure at this point in history.

Anyway, I guess I should rephrase my original question a little...
part of what I'm looking for is a game plan for slowly dragging our
operations kicking and screaming into a brave new world where the
modelers can work in a more high-level, functional style of programming
and feel free to create new building blocks / little languages... and
at the same time I don't want to fry the brains of the IT back-end
VB/C#/RDBMS boys, who might run screaming for the hills if they had to
stop and think recursively for a minute instead of going and looking
something up in a patterns book or API library. Possibly I'm going to
have to start by pushing the R language for some of our exploratory
statistical data processing, since it is at least a bit functional in
nature. Down the road I'd love to introduce a few guerrilla apps
written in F#, PLT Scheme or CL, but right now I have to keep
especially the parentheses undercover... or they'll throw me in a
padded cell.

I'm *certain* some of you must have faced these kinds of frustrations
in getting from A to B. So I'm looking for suggestions on how to go
about it in a graduated manner. Perhaps I should have made the
graduated manner part a bit more clear in my OP.
From: Ken Tilton
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <4831e22f$0$25020$607ed4bc@cv.net>
bullockbefriending bard wrote:
>>No, but ignorance has never stopped me before:
> 
> 
> I know the feeling.
> 
> 
>>Have you looked at
>>pulling the data into an RDF triple store where the R&D guys can get at
>>it with Owl and Sparql and other fancy dan tools?
> 
> 
> Whilst I would like to eventually get into the whole metadata
> business, this won't happen next week. In our line, most of the data
> is either concrete numerical historical or real time data, some of
> which is processed for feature extraction before having multinomial
> logit/probit regressions done against it.
> 
> 
>>How you get that back into production is throw out RDBMS (that is so
>>20th century) and swap in the RDF stuff. One tip: the folks in change
>>control might want a heads up before you throw the switch.
> 
> 
> Right... and just before we warp into hyperspace, some VB weenie is
> going to pop up and say "but where's the loop?". (I actually DID hear
> someone (with a good amount of mathematical maturity, no less) say
> that in about 1993 upon first seeing SQL. :)
> 
> Your idea is sound. No matter how we do things now, it is necessary
> for us to get away from two things which stink of the 20th Century -
> RDBMs, and C++/Java/C#-like programming paradigms. I know this. But I
> have to hasten slowly. It's taking time to get our guys to see the
> virtues of Python, let alone anything beyond that... Eventually we're
> going to need pervasive semantic tagging or whatever in our data so
> that we can 'Be Google' in what it is that we do. If we don't, well
> someone else will and we will be blown away. I know this. But the IT
> guys are back in 1996 swotting away at their certification exams and
> think that there's a design pattern for every problem. The changes I
> want to make will take time and will have to begin small. Even some of
> our modeling guys don't quite get it. I could show them papers on
> (say) Haskell probability monads (which might be a good thing in our
> line of work) and they would say (you guessed it) 'but where's the
> loop?' This is not to say that they are dummies, just that for non
> programming language enthusiasts, these things are a bit obscure at
> this point in history.
> 
> Anyway, I guess i should rephrase my original question a little...
> part of what I'm looking for is a game plan for slowly dragging our
> operations kicking and screaming into a brave new world where the
> modelers can do a more high level functional style of programming +
> feel free to create new building blocks / little languages... and at
> same time I don't want to fry the brains of the IT back end VB/C#/
> RDBMS boys who might run screaming for the hills if they had to stop
> and think recursively for a minute instead of go look something up in
> a patterns book or API library. Possibly I'm going to have to start by
> pushing the R language for some of our exploratory statistical data
> processing since it is at least a bit functional in nature.

Sounds like a good first step, since that is a bit of a standard -- 
these change-fearing, teddy-bear-hugging, thumb-sucking types (I might 
be the wrong internal consultant on this) love standards with lotsa 
books from O'Reilly.

I guess K would still count as weirdo-fringe?

> Down the
> road I'd love to introduce a few guerilla apps written in F#, PLT
> Scheme or CL, but right now I have to keep especially the parentheses
> undercover... or they'll throw me in a padded cell.

Did you say "Cells"? You should come to Lisp-NYC meeting where a guy 
kicking ass and taking numbers in a tall building with a homebrewed 
Cells-alike language regales us with tales of the non-trivial fraction 
of his time spent fighting off the Dark Side Forces of Stability and 
Convention trying to annihilate his group.

> 
> I'm *certain* some of you must have faced these kinds of frustrations
> in getting from A to B. So I'm looking for suggestions on how to go
> about it in a graduated manner. Perhaps I should have made the
> graduated manner part a bit more clear in my OP.

Oh. Should I have made clear that you are Doomed? Evolution happens when 
the obsolete DNA dies off, overtaken by fitter organisms with better DNA, 
not by changing the DNA.

What about the Groovy project in Java? That might get a toe in the door 
for agile development. The one thing we all can see is "Thou shalt not 
have discontinuities in thine technological progression."

best of luck, kt

-- 
http://smuglispweeny.blogspot.com/
http://www.theoryyalgebra.com/
ECLM rant: 
http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
ECLM talk: 
http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en
From: bullockbefriending bard
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <d816eff2-5db1-4258-9a30-8470bd286e69@d19g2000prm.googlegroups.com>
On May 20, 3:25 am, Ken Tilton <···········@optonline.net> wrote:
>
> I guess K would still count as weirdo-fringe?

We *did* have an APL old-schooler with an actuarial background on
board a while back, but he's in a padded cell now for the sin of being
ahead of his time.

I'm trying again in my own small way now.
From: Ken Tilton
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <4832efeb$0$11638$607ed4bc@cv.net>
bullockbefriending bard wrote:
> On May 20, 3:25 am, Ken Tilton <···········@optonline.net> wrote:
> 
>>I guess K would still count as weirdo-fringe?
> 
> 
> We *did* have an APL old schooler from with an actuarial background on
> board a while back, but he's in a padded cell now for the sin of being
> ahead of his time.
> 
> I'm trying again in my own small way now.

Actually, Tim X's stuff (and the guy I mentioned in our drinking club) 
reminds me that the K people actually ended up on our (well, I was a 
consultant) floor at UBS for quite a while (until UBS got merged and the 
new folks, I think, did not want K). Anyway, I then had the advantage of 
being a fly on the wall amongst the regular IT crowd when K came on 
board to enormous fanfare: ten trumpets, a tuba, a modest drum line.

I was curious though skeptical about the line noise; everyone else was 
pissed off by the claimed productivity. I learned from that and am 
successfully keeping Cells to myself with my incessant drumbeat -- 
holding back the doc was disappointingly ineffective, since anyone who 
looked got it pretty easily, so I had to find a way to keep people from 
looking, and I just kept saying how great it was. That behavior on the K 
side had everyone standing around making jokes and doing impersonations 
of the K people. I am not kidding.

So make a deal with management to try CL and then get them to play 
along: they relieve you of your line duties and assign you to "Special 
Projects", a known death knell. You ask your most likely detractors out 
for drinks, get drunk, say you are resigning, let them talk you out of 
it, the market for IT sucks, yadda, yadda. Meanwhile anyone who signs on 
for the doomed Special Project plays along, gets transferred against 
their will, quietly asks friends for references, wears black armbands 
or something, and spends alllllll their time talking about the idiotic 
parens.

Every success is like me and my buddy playing tennis. We have an 
unwritten rule: the rare amazing winner was a mishit, or accompanied by 
"I wish I could do that on purpose." A down-the-line unreachable? "I was 
actually going cross-court."

And make sure you use Cells -- it makes programming an order of 
magnitude easier.

hth,kt


-- 
http://smuglispweeny.blogspot.com/
http://www.theoryyalgebra.com/
ECLM rant: 
http://video.google.com/videoplay?docid=-1331906677993764413&hl=en
ECLM talk: 
http://video.google.com/videoplay?docid=-9173722505157942928&q=&hl=en
From: Rainer Joswig
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <joswig-D99A5A.22195819052008@news-europe.giganews.com>
In article 
<····································@y22g2000prd.googlegroups.com>,
 bullockbefriending bard <·········@gmail.com> wrote:

> Greetings Cantankerous Lisp Gods.
> 
> I stand before you as a reformed C++ coder (programmer would be
> stretching it) who has arrived at this time and place via Ruby
> (wheee!) and Python (sometimes have to get stuff done). I've now
> started on SICP, have PAIP in the mail from Amazon and given another
> few years might end up spending money with Franz... but it's a long
> road out of the darkness.

If I were a sales guy from Franz, I would want to
talk to you. Doesn't your phone ring? ;-) Seriously,
what you describe sounds a bit like a not-so-untypical customer
for them.

> 
> OK, having got that incantation over and done with, I'm now going to
> test your patience with a slightly OT question:
> 
> I work for a company which makes its money in an unusual kind of
> market by doing market prediction. We're professional gamblers who bet
> on this and that... mainly four legged animals. It's quite a good
> business to be in if you know what you're about. A google for Ziemba
> and Hausch should be enough to give a 10,000 ft view of what we are
> about and how we do it (broadly speaking logit/probit models). Not a
> bad business to be in if you're a fairly smart misfit as all the super
> genius non-misfits are working in hedge funds and one can do OK in
> certain betting markets.
> 
> Anyway, our current system is a mix of legacy Win32 stuff and some
> more recent dot Net code + a RDBMS + some SAS stuff too.
> 
> At the end of the day, if it all works well in production, then the IT
> side of the business is fine. This it does. No complaints there.
> 
> Where I see a problem is that this is nowhere near flexible enough for
> our model R&D guys. They need to be able to 'play' with data and
> quickly try  new ideas - preferably without having to talk to the IT
> guys who don't understand any of this higher order stuff and think
> that C# and Design Patterns are the Meaning of Life. Necessary and
> good solid fellows for sure, but not into the standing on shoulders on
> giants and seeing new things paradigm.

There is a lot of frustration about IT out there. It can be a barrier
to innovation. Where I have been, it is all Java. It is super-expensive
software creating a huge maintenance nightmare - the COBOL maintenance
problems will be nothing compared to that. I have the feeling that
the Java-based rule-based system I saw being created could have been
written by me alone in Lisp in the same time - instead of by 15 people.
It was an application domain where Lisp just feels at home.

You will need new people. I feel that there is little chance
of doing new things with the existing people. One of the problems
that AI software had in the 80s was the low acceptance
of new software in IT. Many good solutions failed because
of low management or IT department acceptance. It also
helps if the new software really saves money and does something
critical. There are some Lisp systems that have never been
replaced by non-Lisp stuff. People made sure that really
ancient software kept running - after more than
one replacement project failed. So you need a source of new
people - and it also helps to bring in some technical know-how so
that the first architectures are not written by newbies.

There have been some interesting projects in the past and
present. Some are very unconventional. For example, the
older Connection Machines were just such playgrounds.
The CM2 was a massively parallel computer with 65536 small
processors and 4096 floating point units. Attached were a
fast (for that time) data store (the Data Vault) and a Lisp
system (one could then also get a Sun, ...) controlling the CM2.
You can imagine that you could do a lot of interesting stuff
on such a computer. The architecture break was radical. That's
also the main problem.

Others are running farms of inexpensive Lisp systems for
doing search or executing rule-based expert systems.
http://www.bizrules.info/page/art_amexaa.htm

Recently there was a talk at a Lisp meeting describing a solution
where the algorithms get compiled to reconfigurable hardware.
( http://www.hpcplatform.com/ ). 

> So, I guess what I am asking here is does anybody have experience in
> exploratory work with large corpuses of data which updates regularly
> and lives in a RDBMS backing store (at least master copy does)... and
> how have you structured your work so that you get your stuff done
> without breaking anything on the IT side, and how do you propagate
> changes/developments back into the production system?
> 
> I am not really looking for specifics here, more general strategies
> and overviews about how one goes about setting up and managing an R&D
> playpen/sandpit so that R&D guys can access everything with total
> autonomy.

I would just define a system and give the IT guys a spec
of the events I want to get. They have to figure out how to
deliver these events (depending on the type and amount of
events). If you need a channel back into their IT system, let them
define a system that handles that. If you need more data,
get daily dumps from their data warehouse or whatever you need.
As long as you define the interface to IT clearly, you
can live on your island. Accessing everything with total
autonomy is not possible - unless you (the department)
own that system. What you need is a Lisp infrastructure that
can talk to the IT infrastructure (for example using EAI).
You might want to operate your Lisp/whatever infrastructure without
IT involvement (IT would slow everything down).
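
As a sketch of what I mean by a simple, clearly defined interface -
the event format and all the names below are invented, not something
IT already produces - the IT side could dump events as s-expressions,
one per line, and the Lisp side needs only READ:

  ;; IT delivers a file (or a socket stream) of events like
  ;;   (:race-result :race 42 :horse 7 :position 1 :odds 4.5)
  (defun load-events (path)
    (with-open-file (in path)
      (loop for event = (read in nil nil)
            while event
            collect event)))

  ;; Dispatch on the event type; extend the CASE as the spec grows.
  (defun handle-event (event)
    (case (first event)
      (:race-result (format t "~&Result: ~S~%" (rest event)))
      (t (warn "Unhandled event: ~S" event))))

Whether the format is s-expressions, CSV or XML matters much less than
that the spec is written down and agreed with IT.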


> Must be some of you guys doing smart stuff in your lisp images with
> potentially huge impedance mismatches between your workspaces and
> those of the guys who handle databases, production side, etc.

Sometimes these people use a 64-bit Lisp system with huge memory
and load the data into the running Lisp system (which should be
stable enough to run for a very long time). Franz will tell
you more about their approach. Some Lisp implementations
run on shared-memory multi-core/multi-processor machines. You
can use one of those to pump lots of data into it and get
the performance from such a system. You really want long-lived
processes with in-memory data, or something that scales
across cheaper systems with more distributed processing.
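
Crudely, the "keep it all in the image" style is nothing more than
long-lived data structures in a long-lived process - for example (the
schema here is invented):

  ;; All historical results live in one hash table in the running
  ;; image, appended to whenever a new dump arrives.
  (defparameter *results* (make-hash-table :test #'equal))

  (defun add-result (race-id horse-id position)
    (push (list horse-id position) (gethash race-id *results*)))

  (defun results-for (race-id)
    (gethash race-id *results*))

The engineering questions then become reloading and fault tolerance
rather than query speed.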

> Yes, I know I should probably be posting this in some kind of data
> mining group, but I have feeling that folks here are more likely to
> get what I'm driving at. Also I am not really asking for various
> proprietary solutions, more interested to hear about general
> principles from people who have deal with this issue.
> 
> Five years down the track, I'd be happy if everything we were doing at
> the sharp intellectual end of the business was a lot more functional
> and possibly lispy... but this is a long road. Right now I'm just
> trying to get my head around giving the research guys the right kind
> of environment and freedoms.
> 
> Sorry for the OT posting, and would appreciate very much if anybody
> has any enlightening observations!

-- 
http://lispm.dyndns.org/
From: Dmitriy Ivanov
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <g0ubur$47e$1@news.aha.ru>
bullockbefriending bard wrote on Mon, 19 May 2008 05:43:47 -0700 (PDT) 16:43:

bb> Greetings Cantankerous Lisp Gods.
bb> | ...snip...|
bb> Anyway, our current system is a mix of legacy Win32 stuff and some
bb> more recent dot Net code + a RDBMS + some SAS stuff too.
bb>
bb> At the end of the day, if it all works well in production, then the
bb> IT side of the business is fine. This it does. No complaints there.
bb>
bb> Where I see a problem is that this is nowhere near flexible enough
bb> for our model R&D guys. They need to be able to 'play' with data
bb> and quickly try  new ideas - preferably without having to talk to
bb> the IT guys who don't understand any of this higher order stuff and
bb> think that C# and Design Patterns are the Meaning of Life.
bb> Necessary and good solid fellows for sure, but not into the
bb> standing on shoulders on giants and seeing new things paradigm.
bb>
bb> So, I guess what I am asking here is does anybody have experience
bb> in exploratory work with large corpuses of data which updates
bb> regularly and lives in a RDBMS backing store (at least master copy
bb> does)... and how have you structured your work so that you get your
bb> stuff done without breaking anything on the IT side, and how do you
bb> propagate changes/developments back into the production system?
bb>
bb> I am not really looking for specifics here, more general strategies
bb> and overviews about how one goes about setting up and managing an
bb> R&D playpen/sandpit so that R&D guys can access everything with
bb> total autonomy.
bb>
bb> Must be some of you guys doing smart stuff in your lisp images with
bb> potentially huge impedance mismatches between your workspaces and
bb> those of the guys who handle databases, production side, etc.

As I do not feel comfortable enough discussing social and psychological
aspects in English, I will only address the technological ones.

To tackle the mismatch problem, object-relational mapping (ORM) could be
of use. With this approach, you separate the high-level concepts (e.g.
those used purely in the R&D department) from the layer that implements
fetching/storing object-oriented data from/to an RDBMS (i.e. the
R&D-to-IT bridge).

The Java world has been elaborating this methodology for quite a while;
take a look at Hibernate, for example.

In Lisp, the corresponding instruments could be even more flexible and
powerful. Owing to macrology and the MOP, you do not need an additional
language to describe the mapping.
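
A very small sketch of the idea (the entity and column names are
arbitrary, and real libraries do far more): one declaration can expand
into both the CLOS class and the code that builds instances from
result rows.

  (defmacro def-entity (name &rest slots)
    ;; Defines a CLOS class NAME plus a ROW->NAME function that builds
    ;; an instance from a list of column values in SLOTS order.
    `(progn
       (defclass ,name ()
         ,(mapcar (lambda (s)
                    `(,s :initarg ,(intern (symbol-name s) :keyword)
                         :accessor ,s))
                  slots))
       (defun ,(intern (concatenate 'string "ROW->" (symbol-name name))) (row)
         (apply #'make-instance ',name
                (mapcan #'list
                        ',(mapcar (lambda (s)
                                    (intern (symbol-name s) :keyword))
                                  slots)
                        row)))))

  ;; (def-entity runner horse-id race-id finish-position)
  ;; (row->runner '(7 42 1))  => #<RUNNER ...>

The real libraries attach the SQL generation and caching to the same
kind of declaration.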

Xanalys/LispWorks CommonSQL comes to mind as a starting point, though a
rather preliminary one. YSQL gives a bit more leverage but is still
experimental (excuse the self-reference :-)).
--
Sincerely,
Dmitriy Ivanov
lisp.ystok.ru
From: Tim X
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <8763t9xq11.fsf@lion.rapttech.com.au>
bullockbefriending bard <·········@gmail.com> writes:

> Greetings Cantankerous Lisp Gods.
>
>
> Anyway, our current system is a mix of legacy Win32 stuff and some
> more recent dot Net code + a RDBMS + some SAS stuff too.
>
> At the end of the day, if it all works well in production, then the IT
> side of the business is fine. This it does. No complaints there.
>
> Where I see a problem is that this is nowhere near flexible enough for
> our model R&D guys. They need to be able to 'play' with data and
> quickly try  new ideas - preferably without having to talk to the IT
> guys who don't understand any of this higher order stuff and think
> that C# and Design Patterns are the Meaning of Life. Necessary and
> good solid fellows for sure, but not into the standing on shoulders on
> giants and seeing new things paradigm.
>
> So, I guess what I am asking here is does anybody have experience in
> exploratory work with large corpuses of data which updates regularly
> and lives in a RDBMS backing store (at least master copy does)... and
> how have you structured your work so that you get your stuff done
> without breaking anything on the IT side, and how do you propagate
> changes/developments back into the production system?
>
> I am not really looking for specifics here, more general strategies
> and overviews about how one goes about setting up and managing an R&D
> playpen/sandpit so that R&D guys can access everything with total
> autonomy.
>
> Must be some of you guys doing smart stuff in your lisp images with
> potentially huge impedance mismatches between your workspaces and
> those of the guys who handle databases, production side, etc.
>
> Yes, I know I should probably be posting this in some kind of data
> mining group, but I have feeling that folks here are more likely to
> get what I'm driving at. Also I am not really asking for various
> proprietary solutions, more interested to hear about general
> principles from people who have deal with this issue.
>
> Five years down the track, I'd be happy if everything we were doing at
> the sharp intellectual end of the business was a lot more functional
> and possibly lispy... but this is a long road. Right now I'm just
> trying to get my head around giving the research guys the right kind
> of environment and freedoms.
>
> Sorry for the OT posting, and would appreciate very much if anybody
> has any enlightening observations!
>

I started to write you a fairly long reply that outlined our use of
source code control, database refresh scripts and provision of three
environments to handle promotion of changes into production systems in a
reliable and reproducible way. Then I realised that really your problems
are less technical and more social/psychological. 

From your outline of the problems and description of the IT department,
I suspect you have an all-too-typical environment consisting of separate
'camps' who all believe they are the only ones who know what they are
doing and what is going on, all think they are under-appreciated, and
all think the other camps are cowboys, roadblocks or uncommunicative. In
reality, all are at fault to some degree. 

What you really need to do is address the social and psychological
dimensions. You need to get the IT department to feel they are an active
part of the changes you want to make and not just 'victims' of your wild
unconventional ideas. Get them to 'buy in'. Tell them what the
problems/limitations are that you are facing and ask them for
suggestions. You don't necessarily have to follow them (though it helps
if you can present good arguments as to why you don't). Generally, you
will/can adopt some of their suggestions. Make it clear you want to find
solutions that will also make their life easier, not harder. Ensure
management knows when they do good things and try to let them know you
have done this (in a subtle way, of course). 

A strategy I've used in the past is to ask key members of your IT
division to come to your staff meeting and do a presentation on what
they are currently doing, what their plans are for the future and what
they feel the future IT directions will be. This will make them feel
part of what is going on and appreciated. Asking them for suggestions on
some basic issues/problems not only can provide a good resolution, but
also makes them buy in and encourages them to help solve problems rather
than create roadblocks. Ask them what issues they have with how the R&D
area works and what they feel could be done to make their lives easier,
etc. 

If you can get increased support/help from the IT department, or even if
you can just improve the working relationship somewhat, management will
see the benefits. Then you can start introducing new ideas. Start with
the R&D staff. Get them using CL (or whatever) to prototype ideas. Then
get them to implement them in whatever the accepted language/platform
is. Once you have some successes doing it this way, start to point out
how much faster, more efficient and *CHEAPER* it would be to skip the
whole re-development in the accepted platform, and suggest a trial of
doing production work in the same language being used for
prototyping. Usually management doesn't really care about the technology
if they are being sold concrete results over the medium and
long term. However, be careful to choose something you are very
confident will be a success for your first trial. A failure at this
stage will kill things forever. 

One warning: be careful you really are on top of the
language/technology. CL is a very powerful language, and while it is
easy to learn, it's a lot harder to master. I found this out the hard
way. I managed to get approval for a trial project using CL, but I was
unable to find enough truly proficient CL programmers. I had quite a few
applicants who knew CL, but in the end, they really didn't have a deep
knowledge of the language. Initial progress was good, but as things got
more complex and we needed to improve efficiency and keep on top of
maintenance, the cracks began to show. I think this was partially due to
the language skills and partially due to my own management
limitations. However, the end result was that while the project
completed and the system is still in use (as far as I know), it was not
considered to be the great success I had hoped for, and I believe even
more money is now being put into replacing it (using Python!).

If you really did want a technical description of how we handle our dev,
testing and prod environments and how data is moved between them etc.,
let me know via private e-mail and I'll outline it. 

Tim


-- 
tcross (at) rapttech dot com dot au
From: bullockbefriending bard
Subject: Re: Good Environment for Data Modeling R&D
Date: 
Message-ID: <16801898-4c04-4e5d-8dba-bba9752a72b8@k1g2000prb.googlegroups.com>
On May 20, 4:50 pm, Tim X <····@nospam.dev.null> wrote:
> bullockbefriending bard <·········@gmail.com> writes:
> > [...snip...]
>
> I started to write you a fairly long reply that outlined our use of
> source code control, database refresh scripts and provision of three
> environments to handle promotion of changes into production systems in a
> reliable and reproducible way. Then I realised that really your problems
> are less technical and more social/psychological.
>
> [...snip...]

This is the kind of war story I was after. I've saved a copy of this
and I'm going to read it every day. Supposedly I know all this
common-sense stuff already, but it's *much* better to see it coming
from somewhere else. Management, as you correctly surmise, is not really
the problem... Management is just ignorant, as Management tends to be.
It's the IT guys who are the minefield. They tend to feel paranoid
about being the meat in the sandwich at the best of times, and I have
to try to introduce some changes without making them feel as if
they're now copping more from a new angle. Actually I couldn't give a
rat's ass what they do in their department or how they do it, as long
as they are happy and keeping the wheels of commerce oiled so that we
can bring in the dollars... no interest at all in impinging on their
territory or telling them how to do their job - they know the things
that they do far better than I do... but it's going to be a hard job
convincing them of this. It's also a bit hard to explain to them
precisely WHY the R&D guys need to work in a different, freer
manner with different tools without raising the spectre of differing
levels of ability and personality types... so I hope they don't ask
too many questions.

Anyway, much food for thought here. Thanks!