• 0 Posts
  • 9 Comments
Joined 2 months ago
Cake day: December 9th, 2024


  • He’s complaining that a number isn’t unique and is being poorly used, but the number isn’t supposed to be unique, and the use he’s complaining is missing is exactly the one experts are specifically warned against.

    But on a second, stupider layer, this is the system those numbers originate from. So however they use them is how they’re supposed to be used.

    But then, back above that first stupid layer, on an even more basic, surface-level degree of stupid, the government definitely uses SQL databases. It uses just… so many of them.


  • It’s wild too. I’ve been in the hospital a lot lately and, in addition to a bar-code wristband, every healthcare worker, before doing anything with me (the patient), will ask my full name and either birthday or address and then double-check it against the wristband. This is to make sure, at every step, that they didn’t accidentally swap in some other patient with the same name. (Not so uncommon, lots of men have their father’s name.)

    Meanwhile in like Iceland, everyone gets assigned a personal GPG key at birth so you can just present your public cert as identification, not to mention send private messages and secure your state-assigned crypto-wallet. Not saying such a system is without flaw but it seems a lot better than what we’re doing!


  • This is a good summary. I had to go pull up Wikipedia on it since I roughly knew that Social Security was a national insurance/pension kind of system but am actually hazy on details.

    The major issue with it as an ID (aside from DBAs’ gripes about it) is that credit agencies and banks started to rely on it for credit scores and loans. You see, the US has a social scoring system (what we always accuse China of) but the only thing it tracks is how reliable you are about paying off debts. So with your home address, name, and SSN, basically anyone can take out loans or credit cards in your name. This will then damage your credit score, making it harder to get loans, buy a home, rent property, or even get a job.

    That’s why Americans are always concerned about having our identity stolen: because you don’t need a lot of info to financially ruin someone’s life.


  • I’m hardly the king of databases, but always using a surrogate key (either an auto-incremented integer or a random UUID) has done me pretty well over the years. I had to engineer a combination of sequential timestamp with a hash extension as a key for one legacy system (keys had to be unique but mostly sequential), and an append-only log store would have been a better choice than an RDBMS, but sometimes you make it work with what you have.

    Natural keys are almost always a bad idea though. SSNs aren’t natural, which is one pitfall: implicitly relying on someone else’s data practices by assuming their keys are natural. But also, nature is usually both more unique than you want (every snowflake is technically unique) and less than you’d hoped (all living things share quite a lot of DNA). Which means you end up relying on how good your taxonomy is for uniqueness. As opposed to surrogate keys, which you can assure the uniqueness of, by definition, for your needs.
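To make the surrogate key idea concrete, here is a minimal sketch using Python's built-in `sqlite3` and `uuid` modules. The table name `person`, the column names, and the helper `add_person` are all made up for illustration. It shows both flavors mentioned above (an auto-incremented integer and a random UUID) and that two people with the same name still get distinct keys.

```python
import sqlite3
import uuid

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key assigned by the database
        public_id TEXT NOT NULL UNIQUE,        -- random UUID generated by the application
        full_name TEXT NOT NULL                -- natural attribute, NOT used as a key
    )
    """
)


def add_person(full_name):
    """Insert a person and return (integer surrogate key, UUID surrogate key)."""
    public_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO person (public_id, full_name) VALUES (?, ?)",
            (public_id, full_name),
        )
    return cur.lastrowid, public_id


# Same name twice (say, a father and son): both rows are stored, keys differ.
father = add_person("John Smith")
son = add_person("John Smith")
row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
```

Neither key means anything outside this database, which is the point: uniqueness is guaranteed by how the key is generated, not by anyone else's data practices.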


  • I’m sure folks on here know this, but you know, there’s also that 10K a day that don’t so…

    What makes this especially funny, to me, is that SSN is the literal textbook example (when I was in school anyway) of a “natural” key that you absolutely should never use as a primary key. It is often the representative example of the kind of data that seems like it’d make a good key but will absolutely fuck you over if you do.

    SSN is not unique to a person. They get reused after death, and a person can have more than one in their lifetime (if your id is stolen and you arduously go about getting a new one). Edit: (See responses) It seems I’m misinformed about SSNs, apologies. I have heard from numerous sources that they are not unique to a person, but the specifics of how it happens are unknown to me.

    And they’re protected information due to all the financials that rely on them, so you don’t really want to store them at all (unless you’re the SSA, who would have guessed that’d ever come up though!?)

    It’s so stupid that it would be hilarious if people weren’t dying.
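A rough sketch of how that bites you, assuming (as the comment does, without knowing the exact mechanism) that two records can end up carrying the same number. The table `account`, the helper `try_insert`, and the placeholder value are all hypothetical; the number is deliberately invalid rather than anything resembling a real SSN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The design being warned against: the SSN itself is the primary key.
conn.execute("CREATE TABLE account (ssn TEXT PRIMARY KEY, full_name TEXT NOT NULL)")

FAKE_SSN = "000-00-0000"  # deliberately invalid placeholder, not a real number


def try_insert(ssn, full_name):
    """Return True if the row was stored, False if the primary key collided."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO account (ssn, full_name) VALUES (?, ?)",
                (ssn, full_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(FAKE_SSN, "Person A")
second = try_insert(FAKE_SSN, "Person B")  # same number, different person
stored = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(first, second, stored)  # prints: True False 1
```

The second person simply cannot be stored, and if someone's number ever changes you have to rewrite the key everywhere it is referenced.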



  • I’m not an expert so take anything I say with hearty skepticism as well. But yes, I think it’s possible that’s just part of its data. Presumably it was trained using a lot of available Chinese documents, and possibly official Party documents include such statements often enough for it to internalize them as part of responses on related topics.

    It could also have been intentionally trained that way. It could be using a combination of methods. All these chatbots are censored in some ways, otherwise they could tell you how to make illegal things or plan illegal acts. I’ve also seen so many joke/fake DeepSeek outputs in the last 2 days that I’m taking any screenshots with extra salt.


  • “Reasoning” models like DeepSeek R1 or ChatGPT-o1 (I hate these naming conventions) work a little differently. Before responding, they do a preliminary inference round to generate a “chain of thought”, then feed it back into themselves along with the prompt and other context. By tuning this reasoning round, the output is improved by giving the model “more time to think.”

    In R1 (not sure about GPT), you can read this chain of thought as it’s generated, which feels like it’s giving you a peek inside its thoughts, but I’m skeptical of that feeling. It isn’t really showing you anything secret, just running itself twice (very simplified). Perhaps some of its “cold start data” (as DS puts it) does include instructions like that, but it could also be something it dreamed up from similar discussions in its training data.
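The "running itself twice" picture can be sketched as a toy. This is only an illustration of the simplified description above, not how R1 or o1 is actually implemented: `toy_model` is a hypothetical stand-in that returns canned strings, and the `THINK:` / `PROMPT:` / `REASONING:` markers are invented for the example.

```python
def toy_model(text):
    """Hypothetical stand-in for an LLM call; it only returns canned strings."""
    if text.startswith("THINK:"):
        # First pass: produce a "chain of thought" for the prompt.
        return "step 1: restate the question; step 2: pick an answer"
    # Second pass: answer using whatever reasoning was fed back in.
    reasoning = text.split("REASONING:", 1)[1].strip()
    return "final answer based on: " + reasoning


def answer_with_reasoning(prompt, model=toy_model):
    """Run the model twice: once to 'think', once to answer with that thinking in context."""
    chain_of_thought = model("THINK: " + prompt)
    final = model("PROMPT: " + prompt + "\nREASONING: " + chain_of_thought)
    return chain_of_thought, final


cot, final = answer_with_reasoning("Is an SSN a good primary key?")
```

The chain of thought here is just more generated text that ends up in the context for the second call, which is why reading it is not the same as seeing anything hidden.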