rmd: (fightclubanimated)
[personal profile] rmd
so, i've been having some problems with the world's most troublesome 10M connection

i found the problem. after lots and lots and lots of testing.

The patch cable was bad.
So was the one we tried replacing it with.
So was the one we tried replacing that one with.
So was the one we tried replacing the third one with.
So was the one we tried replacing the fourth one with.
So was the one we tried replacing the fifth one with.
So was the one we tried replacing the sixth one with.

at that point, they didn't have any other ST/LC cables that were long enough for this patch.

SEVEN. SEVEN BAD CABLES.

most of which were still in their factory-sealed bags at the time.

the internet told me that microcenter had two in stock. i went there tonite, bought an ST/LC MMF cable, brought it to the data center, and the tech installed it.

I have my 10M link.

and now, i do believe i'm going to go home and have a drink. or possibly seven. one to toast each of the bad cables.

Date: 2008-11-12 02:45 am (UTC)
From: [identity profile] deguspice.livejournal.com
That's just evil.

Care to share the brand name? (not that I have a need for cables like that)

Date: 2008-11-12 02:56 am (UTC)
From: [identity profile] rmd.livejournal.com
i don't recall, actually. the couple of times i was reading the label card on one of the packages, i was checking that it really was multimode fiber.

Date: 2008-11-12 08:23 am (UTC)
From: [identity profile] catness.livejournal.com
W.
T.
F.

I'm not sure I would have tried 7 times. Sheesh.

Date: 2008-11-12 02:47 am (UTC)
From: [personal profile] cz_unit
Heh. One time in the long past we put a 1.5 million dollar video system in the auditorium. Prizm system with a spider back end, does HD video and things that would make any TF person puke with joy.

One of the six video planes in the mixer was flipping to pink every once in a while. Techs had no clue, I told them to check the crimps. Major presentation, and during it the screen went pink. Techs said it was programming: "If it was a crimp the video would drop if I did this" as he wiggled a cable.

We screamed NO! Video went away. Tech 2 tried to reroute to plane 4, however he had a football game going on plane 4's monitor. So our audience was treated to the PowerPoint going out, then a football game for 5 seconds, then the presentation.

Needless to say they fixed the crimp.

CZ

Date: 2008-11-12 02:56 am (UTC)

Date: 2008-11-12 03:40 am (UTC)
From: [personal profile] muffyjo
Wow. I bow to your superior tenacity. That's amazing. Drink a bonus drink, because you deserve it and won't get one from the company.

Date: 2008-11-12 04:40 am (UTC)
From: [identity profile] unclebooboo.livejournal.com
I have to wonder whether there was something wrong (a weird incompatibility) in the design of the patch cables that you were using so that when you switched to another manufacturer's cable it started working.

This reminds me of a story that only readers of this posting would be likely to appreciate. I once spent a couple of months debugging a problem with a Codex stat mux that would crash once a week or so at one customer site in Italy. The field service guy recognized that the PTT had provided a very noisy circuit, so we started by plugging in a noise generator to see if we could recreate the problem. Sure enough, it was the line noise that provoked the crash, but we still didn't know why the box was crashing. Watching on a protocol analyzer, I figured out that the crash happened about once for every 60,000 bad packets! I woke up in the middle of the night with the solution. Wanna guess?

Date: 2008-11-12 08:55 am (UTC)
From: [identity profile] paradoox.livejournal.com
Increase the bad packet count from a 16 bit number? Don't crash when you overflow the bad packet counter? What?

Date: 2008-11-12 01:29 pm (UTC)
From: [identity profile] unclebooboo.livejournal.com
The link level CRC was a 16 bit checksum. With only a 16 bit CRC, every time a packet gets corrupted with noise you've got about a 1 in 65536 chance that the bad packet will pass the CRC test.

By using a logic analyzer to trigger the "stop" on the protocol analyzer, I was able to capture the moment. The last packet received before the crash was always a packet with a "good" CRC, but its actual contents were garbled.

The protocol used by the stat mux assumed that the link level CRC would be adequate so it didn't do any further error checking on the contents of the packet. Since the product was on its last legs, my boss approved the kludge of adding an extra one byte checksum inside the link level packet- this turned the once a week crash into a once every five years crash.
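As a rough check on those numbers, here is a back-of-the-envelope sketch in Python. It assumes a garbled packet yields an effectively random check value, so an n-bit check misses it with probability 2^-n; the function names are made up for illustration and are not from the stat mux code.

```python
# Rough model only: a corrupted packet is assumed to produce an effectively
# random check value, so an n-bit check lets it through with probability 2**-n.
# Function names are hypothetical, chosen for this illustration.

def miss_probability(check_bits: int) -> float:
    """Chance that a randomly garbled packet still passes an n-bit check."""
    return 2.0 ** -check_bits


def mean_interval_weeks(base_weeks: float, extra_check_bits: int) -> float:
    """Mean time between undetected errors after adding extra check bits."""
    return base_weeks * 2 ** extra_check_bits


crc16_miss = miss_probability(16)           # the 16 bit link level CRC
after_kludge = mean_interval_weeks(1.0, 8)  # one crash a week before the fix

print(f"CRC-16 alone: 1 in {round(1 / crc16_miss)} bad packets slips through")
print(f"Plus a one byte checksum: one crash every {after_kludge:.0f} weeks "
      f"(about {after_kludge / 52:.1f} years)")
```

Under that assumption the extra byte stretches the interval by a factor of 256, which takes once a week to roughly once every five years, and 1 in 65536 lines up with the roughly 60,000 bad packets per crash seen on the protocol analyzer.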

Now, the newer link level protocols all use 32 bit CRC's. Furthermore, TCP (which was developed in the bad old days of 16 bit CRC's) includes its own checksum.
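For the curious, the checksum TCP carries is a 16 bit ones' complement sum rather than a CRC. A minimal Python sketch of that sum follows, in the style described in RFC 1071. It covers only the core arithmetic; real TCP also sums a pseudo-header, which is left out here, and the function name is just for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16 bit ones' complement checksum over data (core sum only)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16 bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


payload = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
checksum = internet_checksum(payload)
print(hex(checksum))

# A receiver sums the payload plus the transmitted checksum; an intact
# packet gives zero.
print(internet_checksum(payload + checksum.to_bytes(2, "big")))
```

It is still only 16 bits, so a random corruption slips past it about as often as past a 16 bit CRC, but it is computed end to end over the segment, independently of whatever check the link layer uses.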

Date: 2008-11-12 01:35 pm (UTC)
From: [identity profile] unclebooboo.livejournal.com
Oh yeah- this was all happening at the blazing fast speed of 19,200 bits per second. I think that puts Regis's slow 10Mbps connection in perspective...

Date: 2008-11-12 11:07 am (UTC)
From: [identity profile] rmd.livejournal.com
well, the patch cables the data center folks bought were all bought at the same time, so i'm guessing it was a bad batch. the design was a bit odd, though, and certainly looks kind of fragile.

the ST fiber connection has a bayonet lock like a coax connection, with the barrel connector that you have to twist to lock down into place.

the design of the failing cables was that the exterior part of the connector just turned, and the interior (the fiber) was on some kind of spring-loaded thing and it pushed down into the connector when you secured it. which is kind of a questionable design choice, i'd say.

the one i got had the fiber stationary and the exterior of the connector pulled forward with spring resistance without changing anything about the fiber.

as for the 60K packets, was it an internal counter?

Date: 2008-11-12 06:41 am (UTC)
From: [identity profile] lioritgioret.livejournal.com
Yer vendor has a wee QA prob.

Date: 2008-11-12 02:07 pm (UTC)
From: [identity profile] i-leonardo.livejournal.com
sounds like all the justification you need to get your overlords to purchase a fiber cable tester.

Date: 2008-11-12 02:10 pm (UTC)
From: [identity profile] rmd.livejournal.com
except it wasn't my cable!

the data center guys had a cable tester, but it didn't have a female ST connector on it, so they couldn't test the patch cable.

i pushed them to do at least some troubleshooting with a laser pointer. i think that's around when they started swapping in every cable they had.

Date: 2008-11-12 03:25 pm (UTC)
From: [personal profile] drwex
And that one burned, fell down, and sank in the swamp.

But THE NEXT ONE STAYED OOP!

Date: 2008-11-13 01:23 am (UTC)
From: [identity profile] evwhore.livejournal.com
Wow. Full marks for perseverance!

Date: 2008-11-14 06:55 am (UTC)
From: [identity profile] madbodger.livejournal.com
Wow. You earned those drinks! Do you lop off the ends of b0rked cables like I do, to minimize their tendency to sneak back to re-annoy you?
