Stata: destring a variable after removing non-numeric characters

The newest solution to this problem is to use the egen method described below, followed by:

replace problemvar = subinstr(problemvar, ",", ".", .)
(see https://www.stata.com/statalist/archive/2009-05/msg00862.html)

followed by destring without the 'dpcomma' option:

destring problemvar, replace
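The same three steps (keep only digits and decimal separators, turn commas into periods, convert to a number) can be sketched outside Stata to check the logic on a few values. This is a minimal Python illustration, not the Stata implementation; the helper names `sieve` and `to_number` and the sample values are made up for the example.

```python
def sieve(text, keep="0123456789.,"):
    """Keep only the characters listed in `keep` (same idea as egen's sieve(), char())."""
    return "".join(ch for ch in text if ch in keep)


def to_number(text):
    """Strip unwanted characters, turn decimal commas into periods, then convert."""
    cleaned = sieve(text).replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        # Nothing numeric left (or more than one separator): treat as missing
        return None


# Hypothetical raw values, similar to free-text laboratory results
raw = ["<4,5", "12.3 mmol/L", "7", "n/a"]
print([to_number(value) for value in raw])
```

Note that a value containing both a thousands separator and a decimal separator (for example "1,234.5") ends up with two periods and is returned as missing in this sketch, so such values would need separate handling.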


Dieter Van Der Westhuizen
Mon 2020-07-27 03:01 PM
To:Jody Rusch (jody.rusch@nhls.ac.za);
Follow-up from my previous email:

*This command works when one has either comma or period in the field:

destring kknew3, replace dpcomma 
Dieter Van Der Westhuizen
Mon 2020-07-27 02:39 PM
How to strip special characters from Stata string variables:
Type the following:
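***********************************
* extrnum: keep only the digits of a string variable and store them as a number
program define extrnum
    version 7
    syntax varlist(max=1) , gen(str)
    * the storage type is e.g. "str12"; characters 4 onwards give the maximum length
    local maxlen: type `varlist'
    local maxlen=substr("`maxlen'",4,.)
    tempvar work
    qui gen str1 `work'=""
    * append each character that real() can convert to a number (i.e. each digit)
    forvalues i=1/`maxlen' {
        qui replace `work'=`work'+substr(`varlist',`i',1) if real(substr(`varlist',`i',1))<.
    }
    gen `gen'=real(`work')
end
*************************************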
How to use:
extrnum var1, gen(newvar1)
 
It works well, except that it drops the commas too.
** Thus use:
findit egenmore
** and install the “egenmore” package by following the instructions.
** Then one can specify which characters to keep:
egen kknew3 = sieve(kk), char(0123456789.,)
** Which lets you end up with this:

Destringing this doesn’t work yet, as some values use “.” and others use “,” as the decimal separator.

Dieter Van Der Westhuizen
Mon 2020-07-27 11:30 AM

https://www.statalist.org/forums/forum/general-stata-discussion/general/967675-removing-non-numeric-characters-from-strings

Dieter van der Westhuizen
Chemical Pathology Registrar
C17 NHLS Pathology Laboratory, Groote Schuur Hospital and Red Cross Children’s Hospital Laboratory
Tel: 021 404 4135 | Cell: +27 82 861 2093 | Fax-to-email: 0866 090 397
dieter.vdwesthuizen@nhls.ac.za | www.nhls.ac.za




Section 7.8 – COVID OTL Dashboard

A dashboard was created to visualise the outstanding test list (OTL) for COVID PCR tests in our province. It was requested by the area manager of the Western Cape and by several members of the Virology Expert Committee. Although this is not a task primarily assigned to chemical pathologists, I have an interest in data science and tried to help. The end result was a dashboard which is updated every morning at 06h00 and every evening at 18h00. A few JavaScript scripts run each day to update three databases on the backend, fed by automated data extractions from TrakCare at a predefined time every morning and evening.

Screenshot 1 – Illustration of the JavaScript code that reads the extracted Excel (or CSV) files and transcribes them to a Google Sheets database.

This dashboard was used (and is likely still being used), especially by the virologists at Groote Schuur Hospital, to track the progress of outstanding COVID PCR tests. It can also show possible bottlenecks in the pre-analytical phase, for example when tests are already registered before the samples are sent to any of our laboratories in the Western Cape.

The dashboard consists of a few pages:

https://datastudio.google.com/embed/reporting/aeafe888-d10f-4959-b1f3-928556abd6f6/page/LMPXB

Figure 1 – Screenshot of the Dashboard. Follow the link above to view.
Figure 2 – Example of the layout of the page which shows the total count of items on the respective outstanding test lists for each respective date for the Western Cape.
Figure 3 – Outstanding tests summary by location. This page is especially helpful if the delay / outstanding tests from a specific hospital or clinic need to be visualized in comparison with other locations (hospitals / clinics).
Figure 4 – Illustration of the number of samples meeting the turnaround time at each respective user site for one day.
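The per-location summary amounts to counting the outstanding rows of an extract for each location. The production scripts are written in JavaScript; the sketch below uses Python purely to illustrate the aggregation, and the column names (`episode`, `location`, `status`) and sample rows are hypothetical, not the real TrakCare extract layout.

```python
import csv
import io
from collections import Counter

# Hypothetical extract: column names and rows are assumptions for illustration only
SAMPLE_CSV = """episode,location,status
E001,Clinic A,outstanding
E002,Clinic A,resulted
E003,Hospital B,outstanding
E004,Hospital B,outstanding
"""


def outstanding_by_location(csv_text):
    """Count the rows still marked 'outstanding' for each location."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row["location"] for row in reader if row["status"] == "outstanding"
    )


counts = outstanding_by_location(SAMPLE_CSV)
# Locations sorted from most to fewest outstanding tests
print(counts.most_common())
```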



Immune Responses to SARS-CoV-2 cause severe COVID-19 in some and recovery in most

Clive Gray – Professor in Immunology
immunopaedia.org – useful web site for immunology resources.

Outline:

  • Basics of Immunology
  • A balance between inflammation and tolerance
  • What happens to people who progress to severe COVID-19?
  • What might be happening in SARS-CoV-2 infected people who remain asymptomatic, have few symptoms and recover?

Internal : External world

~99% of the time the pathogen is destroyed, but in rare cases it may survive.

2 arms of immune responses:

Innate – evolutionary response – very rapid – elements of innate immunity are found in bacteria, plants, lower vertebrates, squids, fish etc.

Some pathogens survive ->

Adaptive immunity – much more targeted / focussed. The immune system targets more specifically the pathogens which survive the innate immunity.

Infection initially -> expansion (peak after maximal viral load) -> contraction with some residual immunity (memory maintenance) -> with a secondary response (recall) there is a more rapid expansion (and higher peak) of the specific immunity.

Immune regulation:

Predisposing conditions: DM, HPT and obesity would make an individual highly susceptible to inflammation due to an imbalance of inflammation vs. tolerance, see below.

The Yin Yang of immunology:

Yin – immune regulation; Yang – Inflammatory Process

Pro-inflammatory (Orange)

TH17 – inflammatory cells secreting the “calling signals” for leucocytes.

TH1 – secreting IFN gamma; IL-2 ; TNF-alpha – cytokines causing inflammation

Macrophage – presents antigens – in lymph nodes and germinal centres

T-Helper cells T-FH

Immune Regulation (Blue)

TH2 – hand in hand with TH1 (opposite)

Regulatory cells (nT and iT regulatory cells)

Actual pathogen is not causing disease – but the immune response – thus this is what should be focussed on to treat the disease.

Dose of the virus (viral load) is key to how you respond to the virus – Initial High dose in viral load likely will lead to high inflammatory response; Low dose (non-robust virus replication) may cause a less severe inflammatory response.

CCLs allow leucocytes to migrate; hence in a cytokine storm, with a high level of migration, the leucocytes cause severe local inflammation at the sites they migrate to.

CCL5-blocking antibodies lead to a rapid reduction of IL-6.

Dexamethasone is not so much an inhibitor of CCR5, but it prevents the hyperinflammation by inhibiting the majority of the inflammatory pathway.

CD8 cells

Within the interstitium, the CD8 cells are present and

CD4 cells activate CD8 cells, hence they are called T-helper cells.

T-cell responses are very prevalent in COVID-19 exposed individuals, BUT CD4 cells and CD8 cells can also react to SARS-CoV-2 viral proteins in unexposed individuals.
RBD – receptor binding domain (spike protein). The amino acids marked in orange are those which are changeable, illustrating how the virus has mutated in a month.



A pepper-pot skull?

HOSP #:
WARD: General Practitioner Practice in Robertson
CONSULTANT: Dr. Jody Rusch
DOB/AGE: 83 years, Male

Abnormal Result

Serum protein electrophoresis demonstrates a 4.4 g/L, IgG kappa monoclonal peak in the gamma region.

Presenting Complaint

Complains of bilateral hip pain and RUQ discomfort.

History

Atrial fibrillation on Xarelto. 

2 x CABG 

Examination

RUQ pain and tenderness

Heart rate regular

Laboratory Investigations

Urine protein electrophoresis: No Bence Jones protein

Serum free light chains:

  • Kappa 62.87 mg/L (3.30-19.40)
  • Lambda 19.63 (5.71-26.30)
  • K:L ratio 3.20 (0.26-1.65)
  • Creatinine 108 (eGFR 56)
  • Calcium 2.42 mmol/l
  • Albumin 40 g/L
  • Hb 12.7 (11.0-16.0) 
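The reported K:L ratio can be reproduced from the kappa and lambda values, and each result flagged against the reference interval quoted next to it. A small Python sketch of that arithmetic follows; it assumes lambda is reported in mg/L like kappa (the unit is not stated above), and the helper name `flag` is made up for the example.

```python
def flag(value, low, high):
    """Return 'low', 'normal' or 'high' relative to a reference interval."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


kappa = 62.87     # mg/L, reference 3.30-19.40
lambda_ = 19.63   # assumed mg/L, reference 5.71-26.30
ratio = round(kappa / lambda_, 2)   # reference 0.26-1.65

print("kappa :", flag(kappa, 3.30, 19.40))
print("lambda:", flag(lambda_, 5.71, 26.30))
print("K:L   :", ratio, flag(ratio, 0.26, 1.65))
```

The pattern (raised kappa, normal lambda, raised ratio) is what the discussion below refers to.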

Other Investigations

U/S shows gallstones.

X-Ray of pelvis shows “sclerotic changes to both hips and pelvis”

Final Diagnosis and Take Home Message:

1. What is the likely diagnosis?

This 83-year-old male with multiple co-morbidities presents with signs and symptoms suggestive of multiple myeloma, with an IgG kappa monoclonal peak confirmed on SPE.

  • CRAB criteria before performing SPE: C- R+ A- B+ (2/4)
  • Bony pain could be secondary to lytic bone lesions associated with MM, but could also be due to sclerotic (wear-and-tear) changes, considering his age. RUQ pain is likely due to gallstones.
  • Renal impairment – this is probably normal renal function for an 83 year old man
  • In medicine generally, an eGFR < 60 represents renal impairment (stage 3)
  • However, in monoclonal disease eGFR < 40 or serum Cr > 177 is the cut-point
  • Bone lesions – myeloma classically causes lytic bone lesions, e.g. “pepper-pot skull”
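The two renal cut-points above give different answers for this patient, which is why the "R" in CRAB is debatable here. The sketch below applies both to his results; it is an illustration only, the function name `renal_flags` is made up, and the creatinine unit is assumed to be µmol/L (no unit is given above).

```python
def renal_flags(egfr, creatinine):
    """Apply the general and the myeloma-specific renal cut-points quoted above.

    Returns (general_impairment, myeloma_defining_impairment).
    """
    general = egfr < 60                        # general medicine: eGFR < 60
    myeloma = egfr < 40 or creatinine > 177    # monoclonal disease cut-point
    return general, myeloma


# This patient: eGFR 56, creatinine 108 (assumed µmol/L)
general, myeloma = renal_flags(egfr=56, creatinine=108)
print("General definition of renal impairment met:", general)
print("Myeloma-specific renal cut-point met:", myeloma)
```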

It was suggested that the clinician talks to the radiologist as to whether the X-Rays were in keeping with myeloma.

2. Critically discuss whether this patient needs a bone marrow biopsy.

The patient’s age, along with his co-morbidities, raises concern about any drastic intervention:

  • he will be an anaesthetic risk if the BM Bx is performed in theatre (assuming that is standard procedure), and
  • it is unclear whether the BM biopsy will add anything further to the already established IgG kappa diagnosis, which can be treated accordingly.

The case should ideally be discussed with Oncology. A bone marrow biopsy is done under local anaesthetic.  The bone marrow will allow the haematologist / oncologist to assess the degree of marrow clonal infiltration.  The important cut-offs are 10 & 60%.  This is important to decide on diagnosis, stage, prognosis, treatment and later, the response to treatment.  The criteria for doing a bone marrow biopsy at our centre are:         

  • Positive CRAB
  • IgG monoclonal peak > 15 g/L
  • IgM or IgA monoclonal
  • FLC K:L > 10
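The criteria listed above can be written as a simple checklist. The sketch below assumes that meeting any one criterion is enough to prompt a biopsy (the list does not say whether they are alternatives or must be combined), and the function and parameter names are made up for the example. CRAB status is passed in as a clinical judgement rather than computed.

```python
def biopsy_reasons(crab_positive, isotype, peak_g_per_l, kl_ratio):
    """Return the listed criteria that are met (empty list if none).

    Assumption: any single criterion is sufficient.
    """
    reasons = []
    if crab_positive:
        reasons.append("positive CRAB")
    if isotype == "IgG" and peak_g_per_l > 15:
        reasons.append("IgG monoclonal peak > 15 g/L")
    if isotype in ("IgM", "IgA"):
        reasons.append("IgM or IgA monoclonal")
    if kl_ratio > 10:
        reasons.append("FLC K:L > 10")
    return reasons


# This patient: IgG kappa peak of 4.4 g/L, K:L ratio 3.20.
# Only the CRAB judgement can trigger a biopsy here.
print(biopsy_reasons(False, "IgG", 4.4, 3.20))
print(biopsy_reasons(True, "IgG", 4.4, 3.20))
```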

Why is there a lower (10%) limit for degree of marrow clonal infiltration? Is there a link to immunoparesis? One likely always has some clonal expansion in bone marrow,  probably a normal or a non-pathological finding. 

3. Discuss the serum FLC in the setting of the renal impairment.

FLC are filtered and reabsorbed by the nephron under normal circumstances, along with other LMW proteins. In a plasma cell dyscrasia, the nephron is overwhelmed by the amount of FLC (of monoclonal origin), which can cause renal impairment; hence renal function is part of the CRAB criteria. Furthermore, renal impairment itself (in the absence of MM) can cause elevated kappa and lambda FLC, usually with a slightly higher ratio (3.2 in this patient).

In patients with renal failure, there is greater retention of serum free light chains. It is difficult to interpret ratios ranging between 1.65 – 3.0 in the context of renal insufficiency. In such cases, further investigation with a 24-hour urine protein electrophoresis and urine immunofixation helps to guide interpretation. If both of these subsequent studies are normal and the patient has no other symptoms suggestive of a plasma cell dyscrasia, then the increased ratio is likely due to the renal insufficiency. 
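The interpretation band described above can be expressed as a small decision helper. This is only a sketch of the reasoning in these notes, not a validated rule; the function name and the wording of the returned messages are made up for the example.

```python
REF_LOW, REF_HIGH = 0.26, 1.65   # K:L reference interval quoted with the results
GREY_HIGH = 3.0                  # upper end of the hard-to-interpret band in renal insufficiency


def interpret_kl_ratio(ratio, renal_insufficiency):
    """Classify a kappa:lambda ratio using the bands described above."""
    if ratio < REF_LOW:
        return "below reference interval"
    if ratio <= REF_HIGH:
        return "within reference interval"
    if renal_insufficiency and ratio <= GREY_HIGH:
        return "grey zone: consider 24-hour UPE and urine immunofixation"
    return "above reference interval"


print(interpret_kl_ratio(3.20, renal_insufficiency=True))   # this patient
print(interpret_kl_ratio(2.40, renal_insufficiency=True))   # hypothetical value inside the band
```

This patient's ratio of 3.20 falls just above the 1.65 – 3.0 band, so the borderline nature of the result is worth keeping in mind.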

4. Discuss electrophoresis briefly.

Electrophoresis is a general term that describes the separation of charged particles/ions under the influence of an electric field; in this case, the separation relies on the charge of proteins. Migration of proteins is based on their charge, size and velocity (the product of their mobility and the field strength).

Make sure you understand why the proteins are charged, the importance of NET charge, and how we keep those charges stable in the field.

If I can take a crack at this: the overall NET charge of the molecule is based on the number of elements (incl. amino acids with varying side-chain moieties), and each amino acid has a different degree of charge based on its R-group. (I think this is the confusion when some mention that electrophoresis is based on charge, and also size. I don’t necessarily think that the two are synonymous.) The stability of the charges within the field is achieved by running the sample solution through a buffer.

Right about the buffer. Remember that size and charge are two different physical aspects that you can use to separate molecules. For example, a DNA gel is a separation purely based on size, because the charge-to-mass ratio is essentially the same for all the molecules. The net charge in proteins is from the side chains, which is why you have to learn about neutral, acidic and basic amino acids. The side chains have different pKa’s and so are charged differently.

a. What is the difference between capillary and gel electrophoresis? Explain which your lab uses and why.

What I described in Q4 was basically the concept of gel electrophoresis, where agarose gel is used as the medium in which the proteins are separated according to their size, charge, and interaction with the medium itself. At TBH we use gel electrophoresis, but will soon be getting a Minicap/CZE.

CZE: As with gel electrophoresis, CZE also separates ions based on their electrophoretic mobility with the use of an applied voltage, all dependent on the charge of the molecule, viscosity and particle size. CZE’s voltage is much higher compared to GE, giving quicker results. CZE uses an electrolyte-filled capillary in which electro-osmotic flow (EOF) is generated: similarly sized and charged ions move together and are subsequently separated and detected at different time intervals.

The more voltage you apply, the faster the separation occurs. However, the limiting factor is that applying high voltages generates a lot of heat, which can denature proteins, thereby altering their conformational shape and changing their NET charge. Capillaries are much more effective at shedding heat as they are long and thin. Thus, very high voltages can be applied and the run time is much shorter.

In gel electrophoresis, you measure how far the molecules travel in a set time, e.g. 1 hour. In capillary electrophoresis, the distance that the molecules move is set, and so you measure the time it takes for the molecules to travel that set distance (like running a 100 m race).
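The difference between the two read-outs (distance in a fixed time versus time over a fixed distance) follows directly from velocity being the product of mobility and field strength. The sketch below illustrates this with arbitrary units and made-up numbers; it ignores electro-osmotic flow and heating, so it is a simplification rather than a model of a real instrument.

```python
def velocity(mobility, field_strength):
    """Migration velocity = electrophoretic mobility x field strength."""
    return mobility * field_strength


def gel_distance(mobility, field_strength, run_time):
    """Gel: the run time is fixed, so the read-out is the distance travelled."""
    return velocity(mobility, field_strength) * run_time


def capillary_time(mobility, field_strength, length):
    """Capillary: the distance is fixed, so the read-out is the time taken."""
    return length / velocity(mobility, field_strength)


# Two hypothetical proteins with different mobilities (arbitrary units)
fast, slow = 2.0, 1.0

# Gel at a low field strength: the faster protein travels further in the same time
print(gel_distance(fast, 10.0, 3.0), gel_distance(slow, 10.0, 3.0))

# Capillary at a much higher field strength: the faster protein arrives sooner
print(capillary_time(fast, 100.0, 100.0), capillary_time(slow, 100.0, 100.0))
```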

The way I reconcile how the CZE differential separation works is by the

  1. driving force of the buffer through the tubing (forward force)
  2. negative charge on the side of the tube (retarding force) 
  3. NET charge on the molecule (many amino acids=higher charge, eg Albumin) (determine degree of retardation of flow)
  4. Voltage powers the whole system 

5. Why is the serum FLC abnormal but not the urine protein electrophoresis?

UPE’s sensitivity is limited by the reabsorption of FLC in the renal tubules. FLC will only be detected in urine once tubular function is lost or the tubules are overwhelmed by the FLC load. This patient’s serum kappa FLC of 63 mg/L might be expected to show on UPE, but tubular function is seemingly still intact, with little being excreted.

Some are of the opinion that SPE and SFLC are the preferred combination to screen for myeloma because of higher sensitivity and specificity, as opposed to SPE and UPE, which may have a slightly lower sensitivity.

It should however be noted that quoted sensitivities and specificities are usually based on retrospective audits of patients who eventually end up in a myeloma clinic. So, it is not certain what the sensitivity and specificity are if you just screen the general population, older people, people with some vague symptoms…

6. Against which epitopes is the FLC assay directed?

The FLC epitopes are located at the interface between the light and heavy chains and are “hidden”: when the light chain is bound in an intact Ig, they will not be detected. Only when these epitopes are “free” can they be detected, hence free light chains. The assay is directed at 2 hidden epitopes.

7. Why is the FLC assay polyclonal and not monoclonal?

The biggest decider many times is COST, but let’s put that aside for now.

It appears that polyclonal antibodies are more robust, give higher yields during production and are easier to make. They are unfortunately less specific, but this is not critical when one wants to measure FLC broadly rather than target particularly specific sites.

Epitopes are three-dimensional shapes that the antibody binds to. This is determined by the amino acid sequence. One drawback of polyclonal assays is that they vary from lot to lot. The difficulty is maintaining consistency in further production and/or distribution of the antibody; it is not a simple process.

8. Describe how a monoclonal antibody is made for use in an assay.

Inject a rabbit (or other animal) with the protein of choice. In three weeks, the rabbit will have produced antibodies to the protein. The rabbit is sacrificed (killed) and the spleen harvested. The spleen is ground up and the cells are put in a culture with a certain myeloma cell line. A fusing agent, classically polyethylene glycol, induces the rabbit cells and the myeloma cells to fuse. The culture also contains HAT medium: hypoxanthine, aminopterin and thymidine. Aminopterin blocks de novo nucleotide synthesis, so cells must recycle (salvage) hypoxanthine and thymidine to survive; this specific myeloma cell line lacks the salvage pathway and cannot do so.

So in the culture there are now three cell lines. 

  • Firstly, there are the rabbit cells that haven’t fused; these will die because they are not immortal. 
  • Secondly, there is the myeloma cell line, this will also die because of the recycling problem. 
  • Finally, there is a fused cell line that will survive . 

Each of these surviving cell lines will produce one Ig against one part of the protein. Now the researchers take the medium and put a tiny amount into each well. The amount is so small that on average each well will contain only one cell; some will of course contain nothing. Then, each well is tested against the protein and the most promising ones are investigated further. An immortal Ig-producing factory directed against one epitope and based on one cell line, a single clone, or as we’d call it a monoclonal, has been produced. Each manufacturer’s immunoglobulin is different and may produce better or worse results.
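The remark that an average of one cell per well still leaves some wells empty can be quantified if the number of cells per well is assumed to follow a Poisson distribution (a common modelling assumption for this kind of dilution, not something stated in these notes). A short sketch:

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean_cells_per_well = 1.0

empty = poisson_pmf(0, mean_cells_per_well)
single = poisson_pmf(1, mean_cells_per_well)
multiple = 1.0 - empty - single

print(f"empty wells:        {empty:.3f}")
print(f"exactly one cell:   {single:.3f}")
print(f"two or more cells:  {multiple:.3f}")
```

Under this assumption only a little over a third of the wells hold exactly one cell, so wells with growth have to be screened (and often re-cloned) before a line can be regarded as a single clone.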

9. The GP, in Robertson, wants some advice on how to proceed. What do you tell him?

A multidisciplinary approach would be best:

  • Treatment for the lytic bone lesions (after opinion by radiology): bisphosphonate
  • Assess overall medication and lifestyle to determine overall risk for worsening renal dysfunction (drugs, co-morbidities……always suggest stopping smoking/drinking)
  • Prevention of thrombotic/infective episodes
  • Treatment of any further abnormalities should they arise (hypercalcaemia, anaemia etc.)
  • Specialist referral:
    • Haemato-oncologist for treatment of MM: UPE, bone marrow biopsy
    • General surgery for gallstones

10. Is there any relevance for the RUQ/gallstone pain in myeloma specifically?

There are some reports where cholecystitis has presented in MM (mets etc.), but it is not a separate entity on its own (such as in POEMS). This is simply the real world, where elderly patients have more than one pathology.