Another neat machine

As part of upgrading my local network, and in anticipation of my new ISP business account, which gives me a /28 segment of public IPs (16 addresses, of which 13 usable), I had to get VLAN-capable switches to replace the cheapo Netgear Gigabit dumb switches I had. That way I can place my router/firewall anywhere I want without pulling a truckload of cables everywhere. I can also dedicate a public IP to a segment leading directly to a victim machine or virtual machine across a VLAN, for malware studies and other little experiments.

After a lot of soul-searching and getting up to speed on lightly managed L2 switches, I settled on two 24-port HP ProCurve 1810G units. I’ll probably get an 8-port 1810G, too.

So, I put one in the office and the other in the back room. First and foremost, these little buggers are fan-free. No moving parts. Lifetime warranty, low power draw (8 W or so). Total silence was my single most important requirement.
[Photo: the HP ProCurve 1810G switch]
The unit is configured through a browser and supports trunking, VLANs, traffic statistics and, not least, port monitoring. That is, I can mirror the traffic on any port to a selectable monitoring port. Ideal for sniffing whatever port you desire.
Another totally unexpected boon was that I was able to read the entire manual and learn it all. This is the first time in maybe ten years that I’ve been able to learn all the features of a non-trivial piece of equipment. And that feels so good.

Oh, and I discovered that OSX Snow Leopard, both server and client, has a super simple graphical UI for setting up virtual interfaces matching VLANs. All I need now is a router/firewall with a couple of connectors, a number of zones, and the ability to map zones to interfaces and VLANs.
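As far as I can tell, the same thing can be scripted with the networksetup tool; here's a minimal sketch, assuming en0 is the physical interface and 10 is the tag you want (I haven't verified the exact syntax on Snow Leopard, so treat it as such):

# Create a virtual interface named "vlan10" on top of en0, tagged with VLAN ID 10:
sudo networksetup -createVLAN vlan10 en0 10

# List the VLAN interfaces that have been configured:
sudo networksetup -listVLANs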

Subversion server on Snow Leopard server

As I already bragged about, I got me one of those delicious little OSX Mini Snow Leopard Server boxes. So sweet you could kiss it. I just got everything in place to make it run a Subversion server through Apache, too, and as a way to document that process I might as well make a post out of it. That way I can find it again later for my own needs.

First of all, the Subversion server is already part of the OSX Snow Leopard distribution, so there is no need to go get it anywhere. Mine seems to be version 1.6.5, according to svnadmin. Out of the box, however, Apache is not set up to talk to Subversion, so that needs to be fixed.
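You can check the version on your own machine like this:

svnadmin --version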

We’ll start by editing Apache’s httpd.conf to load the SVN module. You’ll find the file at:

/etc/apache2/httpd.conf

Uncomment the line:

#LoadModule dav_svn_module libexec/apache2/mod_dav_svn.so
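…so that it reads:

LoadModule dav_svn_module libexec/apache2/mod_dav_svn.so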

Somewhere close to the end of the file, add the following line:

Include "/private/etc/apache2/extra/httpd-svn.conf"

Now we need to create that httpd-svn.conf file. If you don’t have the “extra” dir, create it, then create the file and add:

<Location /svn>
  DAV svn
  SVNParentPath /usr/local/svn
  AuthType Basic
  AuthName "Subversion Repository"
  AuthUserFile /private/etc/apache2/extra/svn-auth-file
  Require valid-user
</Location>

Save and exit. Then create the password file and add the first user. Run this in /private/etc/apache2/extra, so the file ends up where the AuthUserFile directive expects it:

sudo htpasswd -c svn-auth-file username

…where “username” is your username, of course. You’ll be prompted for the desired password. You can add more users with the same command by dropping the -c switch (which would otherwise recreate the file).
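For example, to add a second user once the file exists (“anotheruser” is just a placeholder):

sudo htpasswd /private/etc/apache2/extra/svn-auth-file anotheruser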

Time to create the SVN folder and repository. Create /usr/local/svn and cd into it. Then create your first repository with:

svnadmin create firstrep
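Concretely, the whole sequence could look like this (using sudo since /usr/local usually isn’t writable as a regular user):

sudo mkdir -p /usr/local/svn
cd /usr/local/svn
sudo svnadmin create firstrep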

Since Apache is going to access this, the repository should be owned by the Apache user (www). Do that with:

sudo chown -R www firstrep

Through Server Admin, stop and restart the Web service. Check that no errors appear. Then use your favorite SVN client to check that things work. Normally, you’d be able to address your Subversion repository using:

http://yourserver/svn/firstrep
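As an aside, if you prefer the command line to Server Admin for the restart step, I believe the equivalent is something along these lines (unverified on my part, so treat it as a sketch):

# Check the Apache configuration for syntax errors:
sudo apachectl configtest

# Stop and start the web service the OSX Server way:
sudo serveradmin stop web
sudo serveradmin start web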

Finally, don’t forget to use your SVN client to create two folders in the repository, namely “trunk” and “tags”. Your project should end up under “trunk”.
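From the command line, those folders can be created directly against the repository URL, something like this (using the placeholder server name from above):

svn mkdir -m "Create initial layout" http://yourserver/svn/firstrep/trunk http://yourserver/svn/firstrep/tags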

Once up and running, this repository works perfectly with Panic’s Coda, which can put an entire website under source control. If you don’t know Coda, it’s a website editor of the text editor kind, without much in the way of fancy graphic tools, but it does help with stylesheets and the like. It’s for the hands-on developer, you could say.

The way you manage a site in Coda is that you have a local copy of your site, typically a load of PHP files, which is version controlled against the Subversion repository; you then upload the files to the production server. Coda keeps track of both the repository server and the production server for each site. The one feature that is missing is a simple way of having staged servers, that is, uploading to a test server and only once in a while copying it all up to the production server. But that can be considered a bit outside the primary task of the Coda editor, of course.

You could say that if your site isn’t mission critical, but more of the 200-visitors-a-month kind, you can work directly against the production server, especially since rolling back and undoing changes is pretty slick using the Coda/Subversion combo. But it does require good discipline, good nerves, and a site you don’t really, truly need for your livelihood. You can break it pretty badly and jumble up your versions, I expect. Plus, don’t forget, the database structure and contents aren’t part of your version control unless you take special steps to accomplish that.

Coda doesn’t let you access all the functionality of Subversion. As far as I can determine, it doesn’t have provisions for tagging and branching, for instance. But it does have comparisons, rollbacks and most of the rest. The easiest way to do tagging would be through the command line, or possibly one of the several GUI SVN clients for OSX. I’m in the process of testing the SynchroSVN client. It looks pretty capable, but not all that cheap.
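For the record, a tag in Subversion is just a cheap server-side copy, so the command-line version would be something like this (the tag name is only an example):

svn copy -m "Tag release 1.0" http://yourserver/svn/firstrep/trunk http://yourserver/svn/firstrep/tags/release-1.0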

The cutest little muscle machine ever

I got me that brand new Apple Mini with Snow Leopard OSX Server unlimited edition included. This is such an adorable machine, you wouldn’t believe it. It has everything you could wish for in a server, as far as I can make out after just a couple of hours with it. It’s super easy to set up and to monitor. It’s small, it’s beautiful, it’s almost totally noiseless, and it seems to use hardly any power. When you feel the case, it’s just barely warmer than the environment, and the same goes for the power supply. When I switch off everything else in the room, I can only just hear the server running from less than a meter’s distance. It seems to produce about the same noise level as my 13″ white MacBook does when it’s just started and perfectly cool. In other words, practically inaudible. Still, it’s running two 500 GB drives in there, which I’ve set up as a mirrored (RAID 1) set.

I’ll probably brag about this system some more once I get to know it better. But meanwhile, it’s the nicest computer purchasing experience I’ve ever had. Except for the Mac Pro. And the MacBook. And the iMac, of course. And the iPhone. And Apple TV.

[Image: Mac mini server dimensions]

More on evidence based

This is a continuation of my previous post, “Evidence based vs anecdotal”.

I wrote an email to the main author of the chapter in “4th Paradigm”, Michael Gillam, and he graciously responded to my criticism by agreeing with everything I said and emphasizing that this is what they had wanted to say in that chapter. He suggests that it may not have been clear enough in that regard, and I agree. Anyhow, it’s great to know that smart people do have the right idea of how to handle the knowledge that appears in IT systems, often as a side benefit of having extensive amounts of data in them.

What Michael stresses instead is the benefit of real-time monitoring of the performance of treatments in the population. He points to the Vioxx debacle and how far fewer people would have been subjected to the increased risk of myocardial infarction had the systems been able to signal the pattern in large data sets. And in this he’s entirely correct, too.

So, in conclusion, we’re in total agreement. A problem remains, however, in that even I, the archetypal skeptic, was easily misled into reading the chapter in question as promoting the discovery of new treatment regimes from dynamic electronic health care data. And I think that is exactly what is happening when some new ill-conceived projects are started in health care. I’ve seen an increasing tendency to dream up projects based on just this: the idea that large sets of health care data will allow our electronic health care record systems to recommend for or against treatments based on the large accumulated set of experience data in the system. And I think the reason is that people like us, who reason about how to handle that data and what to use it for and what not to use it for, don’t realize that snippets of our conversations, taken out of context, may lead decision makers to take catastrophically wrong turns when investing in new projects. At least, that’s what seems to happen. Time for an anecdote from real life. (Note the strangely ironic twist, whereby I use an anecdote to illustrate why anecdotal knowledge is a bad thing.)

This is an entirely true story; it happened to me about 30 years ago, while I was doing my residency in surgery. A trauma patient, comatose, with multiple rib fractures and abdominal trauma, in respiratory distress, was wheeled into the emergency room. I asked the male nurse to administer oxygen through a mask and bladder while the blood gases got done. As the results came back, I stood a couple of meters away and quietly explained them to an intern, saying something like “see how the oxygen saturation is way down, he’s shunting, while the carbon dioxide is lowered below the normal value, which you may find strange, but it’s because he’s compensating with hyperventilation”, etc. After a minute of explanations, I look up and see that the nurse is holding the rubber mask over the patient as I ordered, but with no oxygen line connected, so I tell him it fell off. He says, “No, I took it off. You said he’s hyperventilating, so he should re-breathe without any oxygen.” OMG… this guy was actively suffocating the patient after overhearing one word of a half-whispered conversation and applying the only piece of knowledge he possessed that was associated with that word. Which was entirely wrong, as it turns out.

Admittedly, this particular nurse wasn’t the sharpest knife in the drawer; he did this kind of thing with frightening regularity. But still, this illustrates quite perfectly, in my opinion, what politicians and technicians are doing with health care related projects. They catch snippets of conversations, apply some wishful thinking, and formulate a thoroughly sexy project that in their opinion will revolutionize medicine. Except it’s all based on a fundamental misunderstanding. We have to become much clearer in our discussions about exactly what we can use electronic health care record data for, and what we absolutely must not use it for. Yes, we can use it to provide warning signals to epidemiologists and pharmacologists, and ideas for future studies of new phenomena, but we definitely cannot use it to make direct recommendations for or against treatments to doctors while they handle patients. The only recommendations that should be presented to them are recommendations based on thoroughly and correctly performed studies, nothing else.

It’s up to us to see to it that the people in power get the entire conversation, and understand it, before we let them start projects that have the potential to destroy the advances in medical knowledge we have today. They’re entirely capable of suffocating this particular patient in the name of sexy IT health care projects.

Evidence based vs anecdotal

I’m increasingly disturbed by a very backward tendency to implement bad science in healthcare IT systems. More and more often, I read about initiatives to mine electronic healthcare records for data, build some kind of knowledge base from it, and then use that to support clinical decision making. It sure sounds sexy from a technical standpoint, but it’s so wrong.

We used to have anecdotal medicine, or experience-based medicine if you prefer, where each doctor largely learned from his own patients, mistakes, and successes. This led to a lot of wrong conclusions, since outcomes are multifactorial. That is, there are a bunch of reasons why any particular case goes right or goes wrong, and you can’t control for those reasons if you learn from cases after the fact.

Then we decided to advance medical science only through properly designed, prospective, controlled clinical studies, which seems to be the only way to get anywhere in the long run. So that’s what we should do.

The reason I posted this today is that I just read something horrifying in an otherwise excellent book (which you can get for free here), the “4th Paradigm”, Microsoft Press. This is an excerpt:

…current trends toward universal electronic healthcare records mean that a large proportion of the global population will soon have records of their health available in a digital form. This will constitute in aggregate a dataset of a size and complexity rivaling those of neuroscience. Here we find parallel challenges and opportunities. Buchan, Winn, and Bishop apply novel machine learning techniques to this vast body of healthcare data to automate the selection of therapies that have the most desirable outcome. Technologies such as these will be needed if we are to realize the world of the “Healthcare Singularity,” in which the collective experience of human healthcare is used to inform clinical best practice at the speed of computation.

No, please don’t destroy medical science like this…

Need for push

A number of Swedish media sites are down right now, newspapers and stuff, due to a DDoS attack of some kind. Now, this is serious. News sites are at the core of a free and open society.

This got me thinking about how to solve DoS in general, and there are ways. I’d suggest two mechanisms.

1. Move from a pull model to a push model for subscribed web content. Push can be done from any old place, so there’s nothing for the attackers to DoS. I’d imagine the client having a front end or proxy that checks for the right digital signatures before letting content in. The bad guys can still DoS the clients, but with very little return on investment. Not so surprisingly, we don’t have the required technologies in place, but there’s an abundance of components already in existence for such a system, so it should be straightforward to assemble.

2. For those services that can’t be done with push, use a smarter client that is able to go look for services according to preset algorithms or using a form of dynamic DNS. IOW, move the load balancer to the client side instead of the server side. (I’ve done this, it works; a rough sketch follows below.) This won’t eliminate a DoS entirely, but it will make it orders of magnitude more difficult.
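To make the idea in point 2 concrete, here’s a minimal sketch, with invented mirror addresses, of a client that simply walks a preset list of mirrors until one answers. A real client would get the list from configuration or some form of dynamic DNS.

#!/bin/sh
# Client-side failover sketch: try each mirror in turn until one responds.
# The hostnames below are made up for illustration.
for mirror in https://news1.example.com https://news2.example.com https://news3.example.com; do
    if curl -fsS --max-time 5 "$mirror/frontpage.html" -o frontpage.html; then
        echo "Got the front page from $mirror"
        exit 0
    fi
done
echo "All mirrors failed" >&2
exit 1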

The problem here is that there is no incentive for the large hosting players to do anything that diminishes the need for giant pipes and huge data centers. So we can’t count on them to help out.

Useless email limitation

Something just happened here in old Sweden. A doctor sent an email with confidential patient info to a local government office, but fat-fingered the addresses, so it ended up with 200 different people at that office. The problem was, apart from the sheer number of recipients, that the patient he was divulging info about actually works at that office as well. Embarrassing, to put it mildly. Now they’re discussing what disciplinary measures to apply for fat-fingering the destinations.

But the problem here isn’t that he fat-fingered the addresses; the problem is that he used email at all. Except that seems to be established practice here. I don’t, btw. I stick to envelopes or encrypted fax.

I have an email account with the provincial healthcare system where I work, but I can’t get at it from the outside. I used to find that pretty dumb. After reading about this case, I changed my mind. Now I find it totally moronic. Allowing me to access it only from inside the provincial healthcare network gives me the impression that it is somehow a local and safe medium, which it is not. I’m perfectly able to send out any confidential information to absolutely anyone in the world using this system, intentionally or otherwise. The only thing the access restriction actually prevents is… um… normal use?

To be fair, there is the hypothetical danger of someone hacking into my email account from the outside to get at confidential information that someone else may have sent me and that I haven’t, for some reason, deleted. But compared to the danger of me actively sending out information by mistake to the wrong people, like a mailing list or a group address, that risk is negligible. No egress filtering is in place that I know of.

There is one useful solution to all this, namely a messaging feature inside the electronic health care record system, since that automatically limits distribution to other authorized users of the system itself. But in our case, that function disappeared when they swapped out our old system for a new and “improved” one.

In conclusion, I’ll claim that limiting outside access to the mail system like this is an ill-considered and useless move, more likely than not to be counterproductive.

What a strange piece

Can’t help commenting on this opinion piece on TechRepublic: The Apple Tablet will disrupt Apple’s device momentum. Just a few choice quotes:

… 1) providing good entertainment; and 2) providing flexible input and output. The success of the iPhone illustrates how correct I was in the first point. It stumbled into the entertainment angle.

Yes, sure, Apple “stumbled” into success.

In fact, every key innovation in PC technology for the past 30 years has been driven by gamers, but Microsoft, HP, and everyone else – including Apple with its largely accidental success with the iPhone – have ignored this in the portable electronic device market. And the whole time, I feel like I’ve been jumping up and down, screaming, “Deliver the games, become the dominant games delivery system – and the rest will follow!”

You didn’t scream loud enough, did you? Oh, btw, me and a lot of other people I know buy iPhones for other reasons than games, like for their business and connectivity apps, you know? Haven’t heard of those yet? Maybe if you stopped jumping up and down and screaming for a sec, you’d hear us talking about that.

“Take my word: The super-hyped Apple Tablet – which is supposed to be a convergence of the iPhone, the Apple Mac, and the netbook phenomenon – is going to be a failure”

What Apple Tablet? Nobody has seen one, but you already know what it’s going to be and even why it will ultimately fail.

Why will the Apple Tablet fail? The one thing consumers don’t want is another gadget that ultimately does the exact same thing as several other gadgets they already own, especially one that requires all kinds of contortions to move legally-licensed and legitimately-owned content around from device to device.

Yes, especially as Apple is known for making their users buy media again and again every time another product of theirs comes on the market. Like the last time people bought all their media on the Play-For-Sure system only to discover that it wouldn’t play on the new generation Apple Zune. We’re not going to walk into that trap again, are we? When will Apple finally learn to treat their customers with respect, like Microsoft does with their iTunes, which is not only DRM free but works across devices without new purchases being required.

Oh, wait, did I get that backwards?

Then I just skimmed through the rest, but if you feel like being abused by more of that bad thinking, please head over there and enjoy some more of these “insights”.

What’s up with Snow Leopard and file sizes?

Yes, I know Snow Leopard changed the way it calculates file and volume sizes, but what I’m seeing here is too weird to be explained by that. I’ve got a few image files in a folder on my desktop, and the file sizes I’m seeing with ls -al are:

[Screenshot: ls -al output showing the file sizes]

Now watch the PNG file sizes when I look at them using Finder:

[Screenshot: Finder list view with diverging file sizes]

Oops… WTF was that??! A display bug! Let’s try again after juggling the column widths so the selection bar straightens itself out again:

[Screenshot: Finder list view after adjusting the column widths]

Just to be sure, I opened up the info panel on the first file:

[Screenshot: Finder info panel for the first file]

Yes, truly, here it says 109,207 bytes while ls -al says 15,843 bytes for the same file. And yes, I’ve checked and double-checked and triple-checked: I am indeed looking at the same file. A Spotlight search also returns only one image. Uploading the image to a webserver and checking through Transmit shows the 15k size. Here it is, the file, from a webserver: http://vard-it.com/images/20091019/interaktioner1.png, so you can check for yourself.

So why is Finder reporting a size value seven times larger?

Update a little later: yes, I used ls -al@ to find the resource fork, and that is what is making the difference. Maybe Finder should have the option of showing that separately, at least in the inspector? Maybe I should read the man pages before posting? Maybe I should wonder what exactly is in those resources? Maybe I should just shut up and crawl under a rock?

Yet another update: I used 0xED to look into the file and the fork. The fork is full of Adobe info, since I used Photoshop CS4 to convert from BMP to PNG. And, obviously, when uploading the image using Transmit, that fork is stripped off. Well, now I know that Photoshop saves a load of info in a resource fork, possibly including info I don’t want it to save. I can’t see any obvious way of excluding that in the Photoshop save dialog box. So take care when passing on images to others that you strip off the resource fork first. Somehow.

Update about “Somehow”, this is how to do it: create an empty file, copy it over the resource fork, then delete the empty file. Like so, in Terminal:

[Terminal screenshot]
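In case the screenshot goes missing, the commands were along these lines (the filename is just the example file from above, and the /..namedfork/rsrc path is how you address a file’s resource fork directly on HFS+):

# Create an empty file...
touch empty
# ...copy it over the PNG's resource fork, emptying it...
cp empty interaktioner1.png/..namedfork/rsrc
# ...and remove the helper file again.
rm empty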