
Showing posts with label RCTs. Show all posts

Monday, July 28, 2025

Process discovery as a rationale for RCTs?

I have blogged on multiple occasions (see this paper), highlighting the point that the most important thing in development is not the WHAT to do, but the HOW of implementing what’s already known. 

Unfortunately, the mainstream international development discourse and funders are obsessed with the former, almost to the exclusion of the latter. The headline focus on ideas and innovations as opposed to state capabilities is only an illustration. 

Randomised Control Trials (RCTs) are a good example. They inform us about the headline efficacy of an intervention. But they tell us very little about the mechanics of implementing the intervention (in addition to the WHY of the intervention), arguably the most important reason why good ideas rarely translate to realised development outcomes. 

Consider Edtech. A government has fitted all classrooms with smart screens and established a computer laboratory with systems installed with personalised adaptive learning (PAL) software. How do we integrate digital media and its content with the physical classroom instruction across 20,000 schools, or even 100 schools?

There are several uncertain parts, even if the PAL software is mapped to the curriculum. How to effectively use the computer lab - number of hours/classes per week, sequencing of physical classroom and lab work, more than one child using a terminal, monitoring the child’s progress and appropriate follow-up, etc.? Similarly, how to effectively use the smart screen - pedagogy that toggles back and forth between using the blackboard and the digital content, which content to use where and when, how to deliver content effectively, and the general felicity of the teacher in intermediating the digital medium and its contents. See more here

The efficacy of these Edtech interventions depends on getting these details right. I’ll define this implementation challenge as one of process discovery. It is the mapping of the details of the processes that increases the likelihood of successful implementation. 

The process discovery maps provide the blueprints that can be used by frontline officials to implement the respective programs. They would be the default or Minimum Viable Product (MVP) for the implementation of programs. These process discovery maps can be simplified and made as user-friendly as possible to enable their practical utility. Interested officials will start with the MVP and improve it to suit their contexts and styles. 

The headline efficacy evaluation RCT of development interventions is an overrated, even wasteful, pursuit. Did anybody in the world of practice of education pedagogy doubt the efficacy of software like PAL, if implemented effectively, as to require an expensive and long-drawn RCT? The main outcome from these RCTs is the publication of some papers and the burnishing of academic credentials. 

The World Bank and other donors have now implemented tens of RCTs involving Edtech solutions. While there’s a rich library of evidence on the efficacy of Edtech solutions when implemented effectively in pilots, there’s very little by way of process discovery maps about the HOW of its implementation. 

The same could apply to the implementation of teaching at the right level, youth skilling programs, maternal and child health tracking software applications, interventions to improve public health or nutrition, measures to increase the effectiveness of health insurance, use of body cameras by police and hot spot policing, adoption of better management techniques by SMEs to improve productivity, use of dashboards by officials to monitor and follow up on programs, provision of information to improve farm productivity, etc. 

As with the Edtech example above, in all these cases, I’m not sure about the value proposition (to the practitioners in governments and other implementers) that comes with just an efficacy evaluation. Who would dispute that if done well, all of them would be efficacious, albeit in varying degrees? The challenge is to do well at scale.

In these circumstances, if donors still wish to support RCTs, here is a suitable compromise. By all means, support the RCT, but use the opportunity to also prioritise the creation of process discovery maps. They derisk processes and can serve as a template to guide the scale implementation by large systems in business-as-usual environments. 

This can create the incentives for the emergence of process discovery maps along with efficacy evidence. The researchers get to publish papers, the governments get detailed implementation templates, and donors both get efficacy accountability and contribute to effective scaling. 

As a rough thumb rule for donors, how about mandating that every efficacy evaluation RCT also have a process discovery map, wherever possible?

Tuesday, April 1, 2025

Assessing the health of school education systems in India

I have blogged on the issue of improving student learning outcomes numerous times. This post discusses a simple proposal for periodic measurement of learning outcomes and using it as a high-level decision support. 

Poor student learning outcomes should count as one of the biggest governance failures of India. An important contributor is the near complete absence of any institutional mechanism at higher levels to assess the quality of school education being delivered and the progress made in the achievement of basic literacy and numeracy or grade-appropriate learning outcomes. This diffuses accountability across the system at all levels. 

Any meaningful effort, therefore, to improve school education must start with a mechanism to continuously and credibly measure and track the aggregate learning outcome trends over time. This would serve as a health check on the general direction and magnitude of progress being made on improving student learning outcomes and also help compare across administrative units and geographies. It would strengthen accountability. 

A census in the form of standardised tests would be great but is impractical to do across grades and states. It’s for this reason that standardised tests globally are done at only a few checkpoints. 

Another option would be to undertake a rigorously sampled survey across grades and states. Surveys would be administered periodically to a cluster random sample of schools in each district. The baseline survey would be followed with an end-line recurring each year. This time series so generated would help assess subject-wise performance across districts and states. 
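As a rough illustration of the sampling design described above, here is a minimal sketch in Python. The function names, the data layout (dictionaries keyed by district and school), and the unweighted averaging of school means are all my assumptions for illustration; an actual survey would need proper sampling weights and a design worked out by the testing agencies.

```python
import random
from statistics import mean


def sample_schools(schools_by_district, schools_per_district, seed=0):
    """Draw a simple random sample of schools (the clusters) in every district.

    schools_by_district: hypothetical dict of district -> list of school ids.
    """
    rng = random.Random(seed)
    sample = {}
    for district, schools in sorted(schools_by_district.items()):
        # Never ask for more schools than the district actually has.
        k = min(schools_per_district, len(schools))
        sample[district] = rng.sample(sorted(schools), k)
    return sample


def district_scores(sample, scores_by_school):
    """Average the school-level mean scores of the sampled schools.

    This is an unweighted average of school means (an assumption);
    a real design would weight schools by enrolment.
    """
    return {
        district: mean(mean(scores_by_school[s]) for s in schools)
        for district, schools in sample.items()
    }


def yearly_change(baseline, endline):
    """Difference between end-line and baseline scores for each district."""
    return {d: endline[d] - baseline[d] for d in baseline if d in endline}
```

Repeating `district_scores` on each year's end-line and feeding the results to `yearly_change` would give the time series of district-level trends that the proposal relies on.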

The National Achievement Survey (NAS) could be revised accordingly to serve this purpose. Alternatively, or perhaps more appropriately, the surveys could be administered by independent private assessment firms. The Government of India (GoI) could empanel a few testing firms for 3-5 years and allocate them across states. While the state governments would manage the contract, GoI would make payments to agencies. This would prevent conflicts of interest, align incentives, and maintain the rigour of the testing processes and the credibility of the outcomes. 

The firms would undertake evaluations based on a uniform and consistently applied set of principles issued by the Ministry of Education (MoE), GoI. The instruments, while different, must be standardised to test certain competencies. All data could be stored in a portal that would allow for easy longitudinal comparison of performance across administrative units and subjects. This data could also be analysed by independent researchers and provide more actionable insights for policymakers and implementors than those generated by the countless randomised control trials done on this topic over the years. 

Any state government interested in expanding the scope of the survey would have the flexibility to do so. Some might want to compare across blocks or clusters or add a half-yearly midline. The expanded scope could be incorporated into the contract with the testing agency. 

The testing processes could be improved at the margins based on the emerging feedback from the first couple of tests, and the contracts could be revised accordingly. With time, over say, 2-3 years, the testing processes would have stabilised and the outcomes would provide a reliable assessment of the direction and magnitude of progress made by districts and states on different subjects. The important insights would be the trends in progress across the parameters and comparison across neighbouring or similarly placed geographies and administrative units. 

There are several areas where such information would be valuable decision support for policy making and program implementation. Some examples. 

Education Departments at all levels of government could use the survey trends to assess the direction and magnitude of progress on outcomes. Most importantly, it would allow for recognising failures and taking action to address them. Those lagging on trends would become aware of their lag and be forced to introspect on the reasons. Such introspection is critical to any meaningful effort to improve the quality of education. 

A bane of the education ecosystem is the absence of any credible system to reliably monitor outcomes. In its absence, questionable and bad ideas endure. The massive expenditures being incurred by state and central governments on Edtech despite very little evidence of its efficacy in improving learning outcomes is a case in point. More of the same without course corrections is most likely in the absence of any credible outcomes measurement system. A survey-based longitudinal learning outcomes tracking system would audit these interventions, help recognise failings, and allow officials to make the case to pull back on some of these questionable interventions. 

The MoE could consider incentivising states and districts by allocating a part of the Samagra Shiksha Abhiyan budget for the realisation of learning outcomes as measured from the aforementioned survey. This would make outcomes-based financing of school education a reality. To make this sufficiently attractive, the allocations under this component can be made significantly large. This financing strategy could be complemented by competitions among districts and states on learning outcomes realisation.

This monitoring system should be supplemented with measures outlined here and elsewhere on interventions that are proximate to improving the quality of classroom instruction (as against those general inputs in the education objective function).

Monday, June 6, 2022

Evidence and evaluations in development

One of the most popular narratives in the development world is the idea of evidence-based policy making. Its most salient application is the idea of independent and rigorous evaluation of causality. Is this intervention creating the intended effect?

The development academia and commentators debate about the methodological and other challenges to evaluations and lament the lack of awareness and interest among bureaucrats and politicians. However, there are three very important points to be kept in mind.

First, when academicians, commentators and donors talk about evaluations, they are generally talking about evaluations of small experimental programs or pilots. Pretty much all the literature about "rigorous impact evaluations" involves these. They are rarely talking about evaluating ongoing scaled-up programs. This is despite arguably more than 95% of the development world (in terms of budgets, numbers of programs, number of people working etc) being engaged with ongoing large scaled-up programs. 

But there are serious methodological limitations to doing any meaningful causal evaluations of such scaled-up programs. How do we evaluate an intervention that seeks to improve student learning outcomes or improve public health or increase nutrition levels or enhance women's empowerment or equip youth with employability skills by isolating the countless known and unknown determinants? How do we insulate the fundamental efficacy of the idea itself from its implementation at scale by a weakly capacitated state? 

Second, it brings us to the point about new ideas. The assumption is that the development world needs to embrace the idea of innovation and technology that has come to characterise the private sector. However, instead of blindly adopting them, they need to be evaluated for their efficacy and then implemented. 

But, as I have blogged earlier, apart from procedural tweaks and technological solutions, there are very few new ideas per se in the field of development that are not known to practitioners at large. In the last thirty years, I cannot recollect having seen new development programs with meaningful enough impact on persistent development problems which have emerged as mainstream and have been transformative. In terms of primary interventions in the major development sectors, the choice set of interventions before governments is to only marginally tweak the ongoing programs or adapt interventions which have been effective in developed countries. There are no great unknown or untried ideas which can help leapfrog universal and persistent development problems! 

The paths to improving student learning outcomes, delivering better primary health care, skilling youth, empowering women, improving nutrition levels etc are and will be about bringing together certain inputs and combining them effectively. The challenge has been about combining them. While process tweaks, private participation, and technology are useful, effective implementation at scale is mainly about political economy, stakeholder demand, contextual norms, and state capability. 

Third, even if there are such unknown or untried ideas with significant likely impact, their challenge is with getting implementation right. For example, we can think of ideas like self-help groups and microfinance, community health workers, short-term skilling programs, independent quality and social audits, public private partnerships, teaching to the child, and e-governance solutions that have emerged as mainstream in the last three decades. 

But their causal evaluation in a pilot can offer limited insights and guarantees about scale effectiveness. There is the point made earlier about the implementation challenge. It should be borne in mind that even after hundreds of RCTs about microfinance, we are still no more certain about the headline issues. 

The importance of impact evaluations in development discourse is therefore vastly exaggerated. More than any serious evidence, as I have blogged earlier, the impact evaluation movement in international development has been propped up by philanthropic foundations who've been concerned with the purely reductionist approach of finding the greatest value for their small donations. 

Wednesday, November 17, 2021

Generalizability problem with investment models

A common topic in this blog has been the problems with drawing inferences from field experiments in development. Economists refer to the generalisability or external validity of the findings of field experiments. This post will examine the problem in Finance.

This has a wider resonance in the form of what has been called a "replication crisis" even in the world of hard sciences. The seminal work has been that of John Ioannidis, who has shown that the results of many medical research papers could not be replicated by other researchers, a trend exposed by researchers in other fields too. I have blogged here about the work of Eva Vivalt, whose meta-analysis of nearly 600 impact evaluation studies of 20 development interventions shows a remarkable level of non-replicability and inconclusiveness. 

This has an even wider relevance in the context of the debate around the narrative of superior wisdom of experts. The Covid-19 pandemic has been only the latest demonstration of the perils of relying on expert wisdom. Philip Tetlock has done extensive research to show how experts perform even worse than outsiders in predicting future events in their respective fields. 

Robin Wigglesworth has a good article in FT which points to the problem in Finance, citing the work of renowned finance guru and Duke University Professor Campbell Harvey. Before getting to Harvey's work, let's read Wigglesworth's excellent description of the problem, 

The heart of the issue is a phenomenon that researchers call “p-hacking”... P-hacking is when researchers overtly or subconsciously twist the data to find a superficially compelling but ultimately spurious relationship between variables. It can be done by cherry-picking what metrics to measure, or subtly changing the time period used. Just because something is narrowly statistically significant, does not mean it is actually meaningful. A trading strategy that looks golden on paper might turn up nothing but lumps of coal when actually implemented. Harvey attributes the scourge of p-hacking to incentives in academia. Getting a paper with a sensational finding published in a prestigious journal can earn an ambitious young professor the ultimate prize — tenure. Wasting months of work on a theory that does not hold up to scrutiny would frustrate anyone. It is therefore tempting to torture the data until it yields something interesting, even if other researchers are later unable to duplicate the results.

Campbell Harvey claims a replication crisis with market-beating investment strategies identified in top financial journals. He feels at least half of them are bogus and that his fellow academics are in denial on this. The paper is a short six-page read. It starts with the provocative statement,

About 90% of the articles published in academic journals in the field of finance provide evidence in “support” of the hypothesis being tested. Indeed, my research shows that over 400 factors (strategies that are supposed to beat the market) have been published in top journals. How is that possible? Finding alpha is very difficult.

He then outlines the incentive structure facing researchers and journals,

Academic journals compete with impact factors, which measure the number of times an article in a particular journal is cited by others. Research with a “positive” result (evidence supportive of the hypothesis being tested) garners far more citations than a paper with non-results. Authors need to publish to be promoted (and tenured) and to be paid more. They realize they need to deliver positive results.

To obtain positive outcomes, researchers often resort to extensive data mining. While in principle nothing is wrong with data mining if done in a highly disciplined way, often it is not. Researchers frequently achieve statistical significance (or a low p-value) by making choices. For example, many variables might be considered and the best ones are cherry picked for reporting. Different sample starting dates might be considered to generate the highest level of significance. Certain influential episodes in the data, such as the global financial crisis or COVID-19, might be censored because they diminish the strength of the results. More generally, a wide range of choices for excluding outliers is possible as well as different winsorization rules. Variables might be transformed—for example, log levels, volatility scaling, and so forth—to get the best possible fit. The estimation method used is also a choice. For example, a researcher might find that a weighted least squares model produces a “better” outcome than a regular regression.

These are just a sample of the possible choices researchers can make that all fall under the rubric of “p-hacking.” Many of these research practices qualify as research misconduct, but are hard for editors, peer reviewers, readers, and investors to detect. For example, if a researcher tries 100 variables and only reports the one that works, that is research misconduct. If a reader knew 100 variables were tried, they would also know that about five would appear to be “significant” purely by chance. Showing that a single variable works would not be viewed as a credible finding. 
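The "about five in 100" arithmetic in the passage above can be checked with a short sketch. This is my own illustration, not from Harvey's paper, and it assumes that the 100 tests are independent, that none of the variables has any true effect, and that significance is judged at the conventional 5% level. Under those assumptions, p-values are uniformly distributed, so each test is "significant" with probability 0.05.

```python
import random


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of spuriously 'significant' variables under the null."""
    return n_tests * alpha


def prob_at_least_one(n_tests, alpha=0.05):
    """Chance that at least one of n independent null tests looks significant."""
    return 1 - (1 - alpha) ** n_tests


def simulate_false_positives(n_tests=100, alpha=0.05, trials=2000, seed=1):
    """Average count of false positives over many simulated research projects.

    Each null p-value is drawn as a uniform random number (an assumption
    that holds for independent tests with no true effect).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += sum(1 for _ in range(n_tests) if rng.random() < alpha)
    return total / trials
```

With 100 variables, `expected_false_positives(100)` gives five, and `prob_at_least_one(100)` is above 99%, which is why reporting only the one variable that "works" says very little.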

His conclusion is what matters, 

The incentive problem, along with the misapplication of statistical methods, leads to the unfortunate conclusion that likely half of the empirical research findings in finance are likely false.
However, he feels that backtest overfitting (p-hacking) is less of a problem in the asset management industry than in academia, given the former's need to replicate results and protect reputations. 

Monday, August 2, 2021

The importance of roads - the academic debate

I have blogged earlier expressing my disappointment with researchers who question the economic value of investments in basic infrastructure like roads and electricity. These studies are often held up as evidence against channelling scarce resources, especially aid money, into infrastructure. Instead, proponents argue in favour of investing in education, health, nutrition, skilling etc. Two samples below.

This paper on rural electrification program in Kenya finds,
We do not find meaningful medium-term impacts on economic, health, and educational outcomes nor evidence of spillovers to unconnected households. These results suggest that current efforts to increase residential electrification in rural Kenya may reduce social welfare.
This paper on India's rural roads construction program finds,
There are no major changes in consumption, assets or agricultural outcomes, and nonfarm employment in the village expands only slightly. Even with better market connections, remote areas may continue to lag in economic opportunities.

This argument is based on the methodological biases among academic researchers. George Akerlof had pointed to the same, drawing the distinction between hard (quantitative and rigorous) and soft (qualitative) methodologies to examine development phenomena. He said that the former, which is the current preferred approach, generates 'sins of omission' which lead to under-estimation of economic impacts. 

In this context, The Economist points to a new work by Ana Paula Franco et al which examined the long-term development impact of the 30,000 km long Inca Road which stretched across several countries and which was built to collect taxes and deploy troops across the 10 m sqkm empire. 

To test if the Inca road, the Incas’ main thoroughfare, has boosted modern living standards, the authors split the map into small squares. For four indicators of welfare—wages, nutrition, maths-test scores and years of schooling—they compared levels from 2007-17 in squares crossed by the road with those in neighbouring squares not on its route. On every measure, residents of roadside squares fared better than those in adjacent ones, even after controlling for differences in such factors as the slope of terrain and the presence of rivers. Women gained more than men.

How did the road grant such long-lived blessings? The Spaniards used it to ship silver and turned the warehouses into profitmaking shops, often staffed by women (possibly inculcating more equal gender roles). This made land near the road unusually valuable, encouraging colonisers to settle there. The authors argue that Spaniards who moved in claimed legal title to their landholdings and built schools and new roads in the vicinity, creating enduring property rights and public goods. Today, the presence of the Inca road alone accounts for a third of the observed difference in levels of formal land ownership between dwellers on the road and those in nearby areas. It explains half the difference in the number of schools.


From the original paper itself,

We estimate the long-term effects of the Inca Road and find a positive and statistically significant relationship between the Inca Road and hourly wages, with residence within 20 km of the Inca Road increasing hourly wages by around 10.5% as compared to wages in other areas during the period 2007-2017. This effect is as large as that of an additional year of schooling. Along the same lines, we find a significant negative relationship between the Inca Road and child malnutrition, as attendance at a school located within 20 km of the Inca Road reduced the probability of being malnourished by 3.4 percentage points (a reduction of around 8%) in 2005. In addition, we find a positive effect between the Inca Road and educational outcomes, with residence within 20 km of the Inca Road increasing the length of schooling by 1.64 years, which equals 22% of the average years of schooling of our sample (7.41 years).

With respect to the channels of persistence of the effects of the Inca Road, there is a significant and positive association between the Inca Road and the current provision of two main public goods, primary schools and roads. We report that 20-km grid cells crossed by the Inca Road have an extra 18 km of road density, which equals 82% of the road density in our sample. Similarly, we find that grid cells crossed by the Inca Road also have an additional 48 primary schools, for an increase of 79% over the sample average. What is more, we find that there is a positive association between the Inca Road and current property rights. The size of this impact can be appreciated by taking into consideration the fact that only 26% of the members of our sample own their homes, as this makes the increase attributable to the Inca Road equivalent to 31%.

Finally, we find a positive association between the Inca Road and female labor outcomes. Women who live within 20 km of the Inca Road average 1.63 more years of schooling than women who live farther from the road system. This is a sizeable increase (22%) over our sample mean (7.4 years), and one which outpaces the increase for men. Furthermore, there is also a positive association between the Inca Road and different measures of female intra-household bargaining power. Residence within 20 km of the Inca Road reduces the likelihood of being a teenage mother by 1.4 percentage points. It also increases the likelihood that women are making health-related decisions for the household by 5.2 percentage points and the likelihood that they are making decisions about high-value purchases, such as buying a home, by 5.4 percentage points.
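The percentages quoted above are simple ratios of the estimated effect to the sample mean, and they can be reproduced in a few lines. The sketch below is mine, not the paper's; the "implied" sample means for road density and primary schools are back-calculated from the quoted effect and percentage, so treat them as rough inferences rather than reported figures.

```python
def share_of_mean(effect, sample_mean):
    """Effect size expressed as a fraction of the sample mean."""
    return effect / sample_mean


def implied_mean(effect, share):
    """Back out the sample mean from an effect and its quoted share (an inference)."""
    return effect / share


# Figures quoted in the excerpt above.
schooling_all = share_of_mean(1.64, 7.41)    # extra years of schooling, all residents
schooling_women = share_of_mean(1.63, 7.4)   # extra years of schooling, women

# Back-calculated, not reported in the excerpt.
implied_road_density = implied_mean(18, 0.82)  # km of road per grid cell
implied_schools = implied_mean(48, 0.79)       # primary schools per grid cell
```

Both schooling ratios come out at roughly 22%, matching the quoted figures, while the back-calculated means work out to roughly 22 km of road and about 61 primary schools per grid cell.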

It would be great if someone could do a study of the Grand Trunk Road or numerous other major roads built by old Indian empires, like this study of the impact of the Indian Railways by Dave Donaldson.

This is yet another reminder, if one were needed, to the western academic world engaged in development about the importance of basic infrastructure like roads and electricity.

Update 1 (07.11.2021)

Santanu Chatterjee and co-authors find differential impacts on formal and informal firms from the development of national highways,

We find that formal firms that are located in districts along the planned route of the GQ/NS-EW corridor are, on average, 9-10 percent more productive than firms that are not on the planned route. By contrast, informal firms on the highway corridor are no more productive relative to their off-route counterparts. Further, we also find that formal firms located in districts 0-30 miles from an upgraded or completed section of the corridor produce 2-4 percent more for every additional year after the completion of the project. The corresponding benefit for informal firms is much smaller, at around 1 percent. Further, using quantile regressions we find that while the productivity benefits of public investment are spread evenly across the size distribution of formal sector firms, they are strictly increasing in firm size for the informal sector. Additionally, smaller informal firms are significantly disadvantaged by the highway upgrades: public investment benefits not only the larger firms in each sector, but also formal firms much more than informal ones.

As to possibly why this happens, they write,

We use quantile regressions to examine whether public investment has a differential impact along the size distribution of firms in each sector. Here, we find evidence that the complementarities generated by an increase in public investment lead to large firms crowding out the output of smaller firms, both within and across sectors: large firms in the informal sector tend to crowd out smaller firms within that sector and, formal sector firms also tend to crowd out small informal firms. Intuitively, large informal firms tend to have a higher capital intensity in production than their smaller counterparts, and formal sector firms also tend to have higher capital intensity than their informal counterparts over all. As such, public investment benefits not only larger firms in each sector, but also formal firms much more than informal ones. This can help explain why we are unable to find any positive and significant association between public investment and informal production for the average firm in our sample.

Friday, December 18, 2020

More on the evidence-generation industry in development economics

Lant Pritchett points to the example of this RCT which shows that "reducing proximity to schools increases enrollment for boys and girls, increased enrollment leads to increased learning and the effect was differentially larger for girls" and describes such evidence generation as "feigned ignorance". 

I have blogged here and here about a big problem with academic research and the evidence-based policy making movement in international development. It is one of the marginalisation of priors and the emergence of evidence as an ideology.

Development tourists fly in, observe a problem, imagine/invent a solution (read this), then try to generate evidence for the same so as to attract funders and scale the solution. Never mind that no such invention has ever reached scale.

As if not happy with such success, there is a new direction of emerging research. Again the same tourists find something intriguing (sometimes it is stuff which is commonplace in their own countries, like this), formulate a theory of change or hypothesis, and try to generate evidence to attract funding for scale-up. Never mind that the natives have been using the same for centuries or decades, and nobody there seriously disputes the hypothesis.

Both are held up as examples of evidence-based policy making. It begs the question. Evidence for whom? And for what purpose? And the answers appear to be - for the outsider, and for research publication. Or to meet the bureaucratic requirements (what is the evidence?) of the donor. Not for those living with the problem or the solution, nor for those practitioners trying to address the problem or scale up the solution.

One exhibit, forwarded by a friend, is this paper which discovers that threat of disconnections of utility services is effective in enforcing bill payments, and that it is superior to soft-encouragement that merely informs tenants about their delinquency.
Public utilities afraid that service disconnections will have political consequences are reluctant to enforce payment with service cutoff. We test this hypothesis using a field experiment in the slums of Nairobi with two interventions intended to improve repayment for water and sewage services: a soft encouragement that informs tenants about landlord’s payment delinquency and, second, a hard threat of disconnection for nonpayment with enforcement if landlords do not pay. While we find no effect of the soft encouragement intervention, we find very large effects of the disconnection intervention on repayment. Moreover, there seems to be no effect on landlord and tenant perceptions of utility fairness or quality of service delivery, on community activism, on the relationships of tenants with their landlords, or on child health... These results suggest that strict enforcement through disconnections increases payment and the financial position of the utility without incurring political costs.
Did the effectiveness of disconnections and its superiority to soft encouragement really require any evidence at all? Also, can any experiment, howsoever rigorous, convince any practitioner that disconnections don't bring political costs? Leave aside the ethical concerns with "studying" disconnections. 

Another exhibit is this paper on footbridges. What was the need to evaluate the value of footbridges in remote areas? 

Sample these revelations,
Floods decrease labor market income by 18 percent when no bridge is present. Bridges eliminate this effect. The indirect effects on labor market choice, farm investment and profit, and savings are quantitatively important and consistent with the predictions of a general equilibrium model in which farm investment is risky and the labor market can be used to smooth shocks. Improved rural labor market integration increases rural incomes not just through higher wages, but also through these quantitatively important indirect channels.
If you go to the remote interiors anywhere in the world with forested terrain criss-crossed by rivulets and streams, one of the primary demands of villagers living in isolated small habitations is footbridges to cross the streams. In rainy seasons, when the streams are full, the villages get cut off from the outside world for weeks or months, and the villagers suffer badly.

Did we need evidence to show that footbridges are a useful thing? Is qualitative evidence (or the self-evident reality) of people's suffering not enough to make the case for footbridges, without a rigorous quasi-experimental study on labour incomes? Do we need evidence to show that "farm investment is risky and the labor market can be used to smooth shocks"? Or that rural market integration has "indirect channel effects"?

Isn't this all so plainly obvious? Clearly not for the two development tourists who were the PIs in this paper.

It is the same naivety or self-centredness that drives people into wanting to test the efficacy of public spending on rural roads and rural electrification! Imagine if Eisenhower had researchers using the logic of value for money (from partial equilibrium analysis) to question the building of the inter-state highway system (as against spending on welfare or even local roads).

It is like someone from a developing country demanding rigorous evidence to be convinced that Londoners, including the well-off, use public transport or bicycles, or normally buy breakfast from Pret (rather than make it at home).

In the case of the footbridges paper, I guess the methodological neatness arising from the naturally available dataset explains the Econometrica publication. But its natural extension to the serious pursuit of international development is a travesty.

Tuesday, November 10, 2020

RCTs in Development

I have an interview in this new OUP compilation that makes a critical assessment of the use of randomised control trials (RCTs) in the field of development. Available for pre-order here

I am also a discussant at a book launch event hosted by UNU-WIDER on November 10.

Saturday, October 31, 2020

Weekend reading links

1. Large is not always good. Marc Levinson writes how the latest giant container ships, instead of lowering transport costs and raising efficiency, have increased costs, reduced speeds, and created a host of other problems.

Discharging and reloading the vessel took longer as well, and not only because there were more boxes to put off and on. The new ships were much wider than their predecessors, so each of the giant shoreside cranes needed to reach a greater distance before picking up an inbound container and bringing it to the wharf, adding seconds to the average time required to move each box. Thousands more boxes multiplied by more handling time per box could add hours, or even days, to the average port call. Delays were legion... The land side of international logistics was scrambled as well. At the ports, it was feast or famine: Fewer vessels called, but each one moved more boxes off and on, leaving equipment and infrastructure either unused or overwhelmed. Mountains of boxes stuffed with imports and exports filled the patios at container terminals. The higher the stacks grew, the longer it took the stacker cranes to locate a particular box, remove it from the stack and place it aboard the transporter that would take it to be loaded aboard ship or to the rail yard or truck terminal for delivery to a customer. Freight railroads staggered under the heavy flow of boxes into and out of the ports. Where once an entire shipload of imports might be on its way to inland destinations within a day, now it could take two or three. Queues of diesel-belching trucks lined up at terminal gates, drivers unable to collect their loads because the ship lines had too few chassis on which to haul the arriving containers.

2. Gautam Bhan writes about the lop-sided nature of urban land distribution,

Despite the language of “encroachment” and widespread “land grab,” bastis (slums) are on a minute portion of city land — less than 0.6% of total land area, and 3.4% of residential land in the 2021 Delhi Master Plan. This tiny percentage supports no less than 11-15% but possibly up to 30% of the city’s population, most settled for decades. One example shows how skewed this number is. In 2017, parking Delhi’s 3.1 million cars used 13.25 sq km of land, or 5% of all residential area. Cars, then, have more space than the housing of workers, residents, and families.

3. Obituary in FT of Lee Kun-hee, Samsung's Chairman. Lee was a real business titan and a force behind South Korea's economic transformation.

Samsung, which pulled away from Hyundai to become the biggest of South Korea’s chaebol, or industrial groups, by a wide margin. The company is the largest maker of memory chips, smartphones and electronic displays, Samsung C&T built the world’s tallest building in Dubai and Samsung Heavy Industries is the world’s third-largest shipbuilder by sales. Other subsidiaries’ range from theme parks to insurance. It is for the transformation of Samsung Electronics, however, that Lee will be most remembered. Samsung was a minor player in the global technology industry when he took the helm in December 1987, succeeding Lee Byung-chull, his father and the group’s founder... Within five years, Samsung was the world’s biggest producer of memory chips underpinned by billions of dollars of annual investment, even during downturns. Despite this success, shoppers around the world continued to view Samsung’s consumer electronics as poorly designed and undesirable. Lee’s aggressive interventions to change this perception have now become legend. The most famous came in 1995, after the humiliation of finding that Samsung mobile phones he had given as gifts did not work. Two thousand Samsung employees at a phone manufacturing factory south of Seoul were instructed to don headbands marked “quality first” and gather outside. Thousands of phones and other electronic devices — with an estimated total value of $50m — were incinerated on a bonfire and the ashes were pulverised by a bulldozer.

As I blogged earlier, Samsung's spectacular success breaks the mould on several sacred tenets of modern business organisation and management techniques. See this from The Economist.

3. Chandra Nuthalapati et al have a good study that reports significant gains for vegetable farmers from selling directly to supermarkets,

Even after controlling for differences in quality and other relevant factors, we found that imputed farmgate prices that farmers receive in supermarket channels are around 20% higher than the prices received in traditional channels for most of the vegetables considered. For some of the vegetables, price differences are even higher. We also found that selling to supermarkets involves lower transaction costs for farmers than selling in traditional markets, as supermarket collection centers are located closer to the villages and involve lower commission fees Higher prices seem to be needed as an incentive for farmers to deliver to supermarket collection centers, because supermarkets do not offer any other incentives to farmers. In other countries, where supermarkets often procure vegetables from farmers through contracts, farmers benefit from lower price risk or from inputs and extension provided as part of the contracts. In India, supermarkets procure vegetables without contracts, so that higher mean prices are important to ensure regular supplies. We found significant price incentives for comparable qualities. In addition, higher quality grades are rewarded in supermarket channels, which is often not the case in traditional channels. Our data showed that farmers who supply supermarkets typically sell their highest-quality vegetables in supermarket collection centers, whereas they sell lower-quality produce in traditional markets.

While selling to supermarkets will surely have some positive effect, these are excessively big effects. Something seems off about the study.

4. Bihar sugar mill industry fact of the day,

Around 1980, Bihar accounted for 30% of the country’s sugar production, and 28 functional sugar mills. It has now come down to less than 5% of the production, and has 10 mills... At the end of 2016-17, only about 2,900 of Bihar’s estimated 3,531 factories were operational, employing on an average 40 people each. The national average is nearly double, 77 workers. The average salary per annum per worker in Bihar then was Rs 1.2 lakh, again less than half of the national average of Rs 2.5 lakh.
5. FT has a long read on the emerging geo-political struggle in the Middle East between UAE and Turkey, motivated by ambitions in both countries to influence politics in other countries across the region. Their frontline is in Libya, where Turkey is supporting the UN-backed government and UAE is supporting the rebels led by Gen Khalifa Haftar. 
The UAE accuses Mr Erdogan of colonial delusions, supporting Islamist groups and forming a hostile axis with Qatar, its Gulf rival. The belief in Abu Dhabi is that wealthy Qatar provides the funding, and Turkey the muscle as Mr Erdogan seeks to position himself as a leader of the Sunni Muslim world. “Turkey has many things to answer for, with its long-term attempts — in concert with Qatar and the Muslim Brotherhood — to sow chaos in the Arab world, while using an aggressive and perverted interpretation of Islam as cover,” Anwar Gargash, the UAE’s minister of state for foreign affairs, wrote in the French magazine Le Point in June as tensions over Libya soared. Sheikh Mohammed, known colloquially as MBZ, is spearheading the Arab push against Turkey’s influence... The UAE, which has an indigenous population of just 1.5m but is one of the region’s wealthiest countries, has long punched above its weight. Since the 2011 Arab uprisings rocked the region, Abu Dhabi has deployed tens of billions of petrodollars to bolster allies across the Middle East and Africa through trade, aid and the use of military resources. The Gulf state’s foreign investment and bilateral aid to eight countries including Egypt, Pakistan and Ethiopia, has totalled at least $87.6bn since 2011, according to the American Enterprise Institute, which analysed publicly available data.

Turkey is today the hub for the region's dissidents, especially Islamists, who pose an existential threat to the monarchical autocracies. UAE's normalisation of relations with Israel should be seen against this backdrop, as an attempt to ingratiate itself with the West, against Turkey.

A related issue is the intensification of the stand-off between Armenia and Azerbaijan over the Armenian enclave of Nagorno-Karabakh in Azerbaijan. One important reason for the breakdown of the Russia-brokered truce, which had held since 1994, has been Erdogan and Turkey, which have aggressively armed and supported Azerbaijan, thereby emboldening it. A humanitarian disaster is now unfolding which has displaced nearly half of the enclave's population.

6. From Ananth, this article by Norman Doidge on the problems with RCTs in medicine,

An important review of RCTs found that 71.2% were not representative of what patients are actually like in real-world clinical practice, and many of the patients studied were less sick than real-world patients. That, combined with the fact that many of the so-called finest RCTs, in the most respected and cited journals, can’t be replicated 35% of the time when their raw data is turned over to another group that is asked to reconfirm the findings, shows that in practice they are far from perfect. That finding—that something as simple as the reanalysis of the numbers and measurements in the study can’t be replicated—doesn’t even begin to deal with other potential problems in the studies: Did the author ask the right questions, collect appropriate data, have reliable tests, diagnose patients properly, use the proper medication dose, for long enough, and were their enough patients in it? And did they, as do so many RCTs, exclude the most typical and the sickest patients?

7. The reality with Uber's misleading minimum wage adherence claim.

Drivers will be guaranteed earnings — 120 per cent of the local minimum wage — though with a significant caveat: Uber won’t count the time drivers are waiting to be matched with a passenger. When you factor in that period, a Berkeley study suggests that Uber’s promised $15.60 minimum an hour instead becomes, on average, just $5.64, once adjusted for driver expenses such as fuel.

8. This shocking story of the flight of ABC's Beijing Correspondent from China tells everything about today's China, which clearly does not abide by any rules applicable for civilised nations.  

9. A rare example of an exposé of corruption in the defence forces, which is without doubt at least as pervasive there as elsewhere (perhaps even more so, given the lack of external oversight). The problem though with dragging the CBI, CVC etc into investigating works, especially those done in places like Ladakh during the ongoing stand-off, is that it could backfire badly and end up delaying and derailing even those critical and time-bound works.

10. Talking of burying your head in the sand, Eugene Fama, in this interview, is a great exhibit. The level of obduracy on financial markets, negative rates, private debt, impact of central bank policies, business concentration and so on is stunning. Virtually every paragraph is an exercise in denial of reality. Evidently Fama is living in a different world.

11. Economist hails Aditya Puri as the world's best banker!

The attributes are very old-fashioned,
First, Mr Puri’s management style, which features a clear vision, microscopic attention to detail, blunt speaking and a knack for retaining talent... The second factor is strategic discipline. Mr Puri intuited that Indian consumers and firms would be a consistent money-maker and has stuck to that view. He took the sophisticated processes used by foreign banks and used them to target local retail and commercial clients. The result is a large branch network, half of which is outside cities. The firm’s cash-machine and credit-card networks are the largest among India’s private banks. Mr Puri stayed away from foreign ventures and investment projects, avoided lending to India’s indebted oligarchs, and financed HDFC’s balance-sheet through deposits rather than debt... The final element is HDFC's approach to technology—though not a pioneer, it is a fast follower.

12. A Livemint story on the PLI scheme for mobile phone manufacturing, which has a five-year allocation of Rs 41,000 Cr. This is about the success of the segment as well as the distance to be travelled,

India had two mobile manufacturing units in 2014. By 2019, there were over 200. The number of mobile handsets produced shot up from 60 to 290 million in the same period; the value of handsets produced jumped 10 times to $30 billion... China exported phones worth over $100 billion in 2019; Vietnam over $35 billion. India exported less than $3 billion in 2018-19.

Even with the PLIs, India stays below Vietnam and China on cost-competitiveness,

Assuming that $100 is the cost of producing a phone without subsidies, China can make it at $80 after factoring in the incentives the country provides. Similarly, the cost of manufacturing a phone in Vietnam. The PLI scheme bridges some of India’s deficit. The manufacturing cost, after factoring in PLI and other subsidies, totals $92-$93.

Interesting thing about the extent of subsidy, which is very significant,

The scheme is also a massive discount on India’s current value-add, the advisor mentioned above explained. Manufacturers in India import most of the components and the assembly value ranges between 8% and 15%. “If 15% is the assembly price, an incentive of 6% is almost a 50% discount," he said.

These are very instructive numbers. If, even in assembly, India is not able to compete with Vietnam and China, that's disturbing. But perhaps this underscores the need to localise component production to become competitive. That will hopefully happen in due course, and the PLI scheme will expedite it. But till then, the incentive is a massive subsidy cost being incurred. If it does not catalyse component manufacturing, then this can just as well be described as a corporate freebie.
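
The adviser's arithmetic can be checked with a quick sketch. It assumes, as the quote implies, that the 6% incentive and the 8-15% assembly value are measured against the same base (the value of the phone). The function name is my own, purely illustrative.

```python
def incentive_share_of_assembly(incentive_pct, assembly_pct):
    """Incentive expressed as a fraction of the assembly value added.

    Assumption: both percentages are taken on the same base (phone value).
    """
    return incentive_pct / assembly_pct


# Range of assembly value quoted in the story: 8% to 15% of the phone's value.
for assembly_pct in (8, 15):
    share = incentive_share_of_assembly(6, assembly_pct)
    print(f"assembly value {assembly_pct}% -> incentive covers {share:.0%} of it")
```

On these assumptions the incentive covers about 40% of assembly value at the top of the quoted range and 75% at the bottom, which brackets the adviser's "almost 50%".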

13. The IPO of Ant Financial to raise about $35 billion, the world's largest ever, has attracted a staggering $2.8 trillion of orders from more than 5 million individuals, a sum which exceeds the value of all stocks listed on exchanges in Germany or Canada. For retail investors, the simultaneous listing at Shanghai and Hong Kong was oversubscribed more than 870 times. The company has a billion users and more than $17 trillion in yearly payment volumes.

14. Gillian Tett points to the alarmingly low recovery rates that recent CDS auctions project for corporate bonds.

Most CDS contracts stipulate that financiers need to know what a company’s cheapest available bond will be worth at the point the company defaults. That’s because CDS contracts make investors whole by paying them the bond’s original face value minus its market value. When a company goes bust, financiers hold an auction to determine the market price, and the resulting prices offer one guide to what creditors think the company’s remaining assets are worth. Over the past decade, the average CDS auction prices have moved in a band between 10 and 60 cents on the dollar, but have generally been between 30 and 40 cents. However the nine US auctions conducted in the year to August produced an average price of just 9 cents — and just 2.4 cents if you look at the worst four: Chesapeake, California Resources, Neiman Marcus Group, and McClatchy.
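
The payout rule in the excerpt (face value minus auction-determined market value) is simple enough to sketch. This is only an illustration of that one rule, using the auction prices quoted above; the function name and the $100 face value are my assumptions, and real CDS settlement has details this ignores.

```python
def cds_payout(face_value, auction_price_cents):
    """Payout to the protection buyer: face value minus market value.

    auction_price_cents is the auction price in cents on the dollar.
    """
    recovery_rate = auction_price_cents / 100
    return face_value * (1 - recovery_rate)


# Auction prices taken from the excerpt; 35c is the midpoint of the usual 30-40c band.
scenarios = [
    ("usual band midpoint", 35),
    ("nine US auctions, year to August", 9),
    ("worst four auctions", 2.4),
]
for label, price in scenarios:
    print(f"{label}: payout on $100 face value = ${cds_payout(100, price):.2f}")
```

The lower the auction price, the closer the payout gets to the full face value, which is another way of saying that creditors expect to recover almost nothing from the company's remaining assets.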

Worsening matters, bondholders are being continuously shortchanged, 

And because loans take priority over bonds in a bankruptcy, the practice has also weakened bondholders’ claims, sparking fights in some bankruptcies... Bondholders’ claims have been further undermined by debt exchanges and stealthy asset transfers, including one known as the “J-Crew trap door”. Named after the recently bankrupted US retailer, it refers to a manoeuvre pulled off by the company’s private equity owners in 2016 in which they transferred intellectual property rights across to new lenders, out of the reach of the original creditors. Similar tactics have emerged at other troubled groups such as Travelport.

And all this is being driven by the search for yield among investors,

Indeed, four-fifths of US loans issued last year were “covenant-lite”, that is they had little or no control over borrower behaviour, up from one-fifth at the start of the decade. That is because investors are so desperate to chase returns in a zero-rate world that they no longer dare to impose covenants. Indeed, the hunt for returns is so frenzied that junk bond yields have plunged from 12 per cent in March to below 6 per cent. Cheap money, in other words, is enabling some zombie companies to stagger on, even as creditor value shrivels — until they collapse.

15. Fascinating article about the QR Code, the low-profile but functionally valuable invention in 1994 by Masahiro Hara to track components in car factories. Its use took off with its adoption by Ant Financial to make mobile payments through Alipay, and has not looked back. It was the crucial link which enabled the use of mobile phones for digital payments. It's now being used for everything from digital payments to browsing dinner menus online. 

Mr Hara worked at Denso Wave, part of a components group allied to Toyota, which used barcodes to label components in plants. But the barcode, first used in an Ohio supermarket in 1974, could be hard to use — as anyone who has tried to scan a bag of frozen peas will know — and did not hold much information. He solved the data constraint by making the QR code a two-dimensional square instead of a horizontal strip, allowing it to store up to 4,200 characters compared to 20 on the barcode. His team also conquered the time-consuming awkwardness of barcodes — every QR code includes three squares at its corners that help scanners to focus rapidly (hence, quick response). Japanese carmakers found it very useful: it saved some workers from having to scan up to 1,000 barcodes a day. 

This is one more point in support of the argument I've been making that Alibaba is a more entrepreneurial e-commerce engine than Amazon,

The QR code enabled Ant to pioneer mobile payments in China through its Alipay super app. The renaissance of QR codes, after years of half-baked efforts by US advertisers and retailers to use them for marketing campaigns and shopping vouchers, shows that it takes time for the strengths of some inventions to emerge.

And this is interesting, an illustration of how non-patenting of such general purpose ideas can have large positive externalities,

But Denso Wave realised that the QR code had greater potential and did not enforce its patent rights. That enabled others not only to use it free but make variations for their industries. The invention knocked around for a decade without finding another compelling use until Alibaba, the Chinese ecommerce group co-founded by Jack Ma, realised it could be used for payments. Shopping in the US and Europe, both online and in stores, is mostly done with payment cards, but the QR code offered an alternative.

It was the industry's good fortune that the QR Code was not invented in the US by the likes of Apple, who would have immediately patented it. 

16. A summary of the changes incorporated in the regulations proposed to implement the new labour codes in India. 

Monday, October 5, 2020

Shifting the focus on public services delivery away from efficiency towards quality

This (also this) is an RCT study on Aadhaar-based biometric authentication (ABBA) of PDS beneficiaries using electronic Point of Sale (PoS) terminals and the reconciliation of the foodgrain delivered to PDS shops with the data from PoS terminals. The study found "large reductions in leakage, but also significant reductions in benefits received", mostly arising from the reconciliation part. 

In an excellent illustration of both the limitations of economic research methodologies and the cavalier manner in which rigorous research is often packaged and disseminated, Jean Dreze, Reetika Khera and Anmol Somanchi write,
Suppose a PDS dealer receives 20 kgs of rice from the government every month, to be distributed to Savitri Devi. If Savitri does not get her rice (for whatever reason) in a particular month, this will show – hopefully – in the ABBA-generated transaction records. In that case, 20 kgs can be deducted from the dealer’s rice allocation in the following month, effectively preventing him (or so it seems) from siphoning off Savitri’s ration. In short, reconciliation helps to curb bogus transactions. Reconciliation sounds like a good idea, especially in a PowerPoint presentation.
In practice, it poses multiple challenges. Just to invoke one, again with an example, consider Olasi, a widow living alone for whom ABBA does not work, perhaps due to rough fingerprints. Before reconciliation, the dealer used to give her rice nevertheless – after all, he was not paying for it. Is he likely to continue giving rice to Olasi after reconciliation? You guessed it. And what if Savitri’s dealer tries to cope with reconciliation by taking a 5-kg cut from Savitri’s monthly ration for a few months? We know that this sort of ‘adaptive corruption’ happened in Jharkhand post-reconciliation, not only blunting the reform but also leading to further exclusion in many cases. As these examples illustrate, reconciliation is not exactly a surefire policy. In particular, it requires an exacting level of preparedness – high performance of ABBA, complete transaction records, an efficient supply chain, cooperation from dealers, and more. We submit that the PDS in Jharkhand was nowhere near the required level of preparedness in July 2017.
Their conclusion is very important,
An RCT is not a guarantee of objectivity. However ‘rigorous’ the evidence may be, it still needs to be interpreted, summarised, and conveyed. It is quite easy for the evidence to get distorted or embellished in this communication process.
To drive home their point, the authors even write down what should have been the actual abstract of the RCT study. The original abstract and that written by Dreze et al convey it all.

Dreze and Co highlight two things. One, how superficial understanding of context leads to misleading interpretations and conclusions. Two, how researchers spin their experiments to propagate misleading conclusions. This is another similar example I blogged earlier about. 
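
The reconciliation logic in the Savitri and Olasi examples can be made concrete with a toy sketch. It is a deliberately simplified, single-beneficiary version of what the excerpt describes, not the actual rule used in Jharkhand; the function and variable names are hypothetical.

```python
def next_allocation(entitlement_kg, stock_received_kg, recorded_distribution_kg):
    """Next month's allocation to the dealer under reconciliation.

    Assumption: any grain the PoS records cannot account for is presumed
    to be still with the dealer and is deducted from the next allocation.
    """
    presumed_balance = max(stock_received_kg - recorded_distribution_kg, 0)
    return max(entitlement_kg - presumed_balance, 0)


# Each case: what the PoS transaction records show for a 20 kg entitlement.
cases = {
    "Savitri authenticated, rice recorded": 20,
    "Savitri not served, nothing recorded": 0,
    "Olasi served without authentication, nothing recorded": 0,
}
for case, recorded_kg in cases.items():
    print(f"{case}: next allocation = {next_allocation(20, 20, recorded_kg)} kg")
```

The second and third cases produce the same deduction. The records cannot tell siphoned rice apart from rice handed to someone for whom ABBA failed, so the dealer who keeps serving Olasi ends up paying for it himself.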

In the context of this debate, I am struck by the proliferation of studies that focus on leakage reductions and efficiency improvements on government programs. The studies add to the narrative of pervasive leakages and the utility of technology in its reduction. The narrative gets more entrenched.

This narrative completely misses the objective of the program itself - ensuring poor people access to food security, through the PDS. Technology is only a means to achieve the objective, though in the process, it can reduce leakages. In other words, the primary role of technology should be to help meet the program objectives, with leakage reduction being only a by-product. 

For different reasons, the institutional incentives of bureaucrats have over the years come to be aligned towards efficiency improvements and leakage reductions. Awards and other incentives reward performance that highlights leakage reductions, not efforts that maximise coverage of the eligible (minimise exclusion errors). In fact, nobody is even talking about these aspects.

As a research agenda, someone should analyse the Prime Minister's Excellence Awards and see what proportion of them are best practices that involve use of technology and are focused on efficiency, as against those that involve non-technology interventions and are focused on quality of service delivery. As another research agenda, researchers should supplement the works of people like Dreze and focus on highlighting the access and quality deficiencies of welfare programs and the interventions undertaken to address them.

Undoubtedly things were once very bad. Fortunately, even without (before) RCTs and technologies, things have improved significantly. Egregious forms of corruption and inefficiencies are becoming increasingly marginal and are now confined to a few isolated pockets. This is true of attendance among government officials like teachers or doctors, leakages from in-kind and cash benefit programs, ghost engineering works, and so on.  

The egregious nature of the inefficiencies meant that administrative focus was often prioritised on the efficiency dimension - ensuring attendance, limiting inclusion errors and pilferages, and dimensions of roads and irrigation canals or numbers of toilets constructed, and so on. It helped that all these are easily quantifiable and measurable. And technology makes it easier still. Anyways, this was all about getting to the starting line. 

Having largely gotten to the starting line, it's now time to refocus attention on the original objectives of programs - whether they are delivering on objectives, the deficiencies, and their reasons. This means a different set of priorities focused more on quality - student learning outcomes and quality of medical care administered; realisation of food security and employment guarantee, especially to the poorest; quality of the buildings, roads, and canals; utilisation of toilets and so on.

This poses a set of problems of assessment - quantification and measurement, and doing them in a non-burdensome and credible enough manner. Unlike the metrics of efficiency, those of quality are not easily measured. In fact, some of them cannot be measured at all, especially for meaningful real-time monitoring at scale, and we need to rely on proxy indicators. In most areas, even all the wonderful technologies fail to contribute anything significant in measuring quality issues.

In fact, I cannot help feeling that Aadhaar and IT arrived perhaps at least a decade too late to be meaningful in addressing the efficiency objective. And while useful, they are marginal contributors in the larger scheme of things in realising the objectives. Aadhaar and IT have their uses at several other places, but these are not exactly the areas where they add much value.

It's like the difference in approach between moving a system from bad to average, and average to good. The former demands focus on some low-hanging efficiency improvements (even at the cost of objectives and quality) and the latter has to prioritise quality and objectives. And that is much harder. 

This applies as much to many other areas of development. For example, consider the narrative on agriculture. It is now excessively focused on the rapacious middleman exploiting the weak farmer. Accordingly, the idea of efficiency enhancing stuff like deregulation and technology innovations (e-NAM) becomes very appealing. But we now know that the maximum gains that can be squeezed out by farmers even by complete elimination of middlemen and direct engagement with buyers is less than 10% of crop value. It has not been technology or innovations, but just better governance and general development that have minimised the egregious rent-seeking and exploitation by middlemen.  

The need now is to focus on the basic objective of getting better prices for farmers and connecting them directly to impersonal markets. This in turn demands things like capacity to sort and grade produce, and adherence to contractual obligations by all sides. These are areas less amenable to technology-based reforms and progress measurement.

I think it is important to rebalance the focus of public services delivery and development in general. The pendulum appears to have swung excessively towards improving efficiency and reducing costs and away from quality of service delivery and meeting program objectives (or issues of resilience and fairness). That prioritisation was important for a bygone time when program delivery was very inefficient. I have blogged earlier about how the excessive focus on efficiency distorts perspectives.

For researchers, this reframing of narratives and rebalancing of priorities involves moving away from quantitative and RCT-focused methodologies to mixed methods which combine appropriate quantitative and qualitative approaches.

It is important for researchers to be cognisant that their research priorities are actually creating more damage and distortions than contributing to any improvements. Their headline-grabbing inclinations and research are reinforcing an entrenched narrative which is well past its sell-by date.

It is easy and sexy for bureaucrats to pursue leakage reductions. Research on leakage reductions is easily measurable and amenable to RCTs and the like. In this, both researchers and implementers have become captives of technologies. Technology makes it easier for both to pursue their respective agendas.

There is a much broader lesson here. Narratives, which were relevant for a period, often endure long after the underlying conditions change. There is a very high degree of hysteresis in the process of Bayesian updating among researchers. The persistence then starts to cause damage. One example is the enduring narrative that countries should continue with trade liberalisation, despite the very low current baseline of tariffs. But Dani Rodrik and others have shown that we are well into diminishing and even negative returns from any further reductions in tariffs.

Of course, I am not even talking about misleading dissemination of research, a more serious charge laid by Dreze and co-authors.

Update 1 (28.11.2020)

Sonalde Desai and Pallavi Choudhuri write about the extent of exclusion errors from digital transfers,
Recipients of PM KISAN were not among the poorest households, nor were these the households most affected by the Covid 19 lockdown. Data from round-3 of NCAER Delhi Coronavirus Telephone Survey (DCVTS - 3) covering a sample of 3466 households in June in the Delhi NCR region, suggests that 21% of farm households received transfers through PM KISAN. However 42% of those households belonged to the wealthiest one-third of the sample, while another 28.5% belonged to the middle one-third.

Monday, April 13, 2020

A reality check on economists as plumbers

I have already written about economists as plumbers here, here, here, here, and here. Economists are plumbers only to the extent that instructors of plumbing courses in a polytechnic can be trusted with actual plumbing for your house.

In her now famous Economist as Plumber speech delivered at the annual AEA meeting in January 2017, Esther Duflo posited her own straw man case and poured scorn on the officials of the Health Department of the Government of Kerala,
Their best idea is to reform the public primary health care system in order to make it more attractive to customers (who for the most part seek their health care in the private sector, like elsewhere in India), and to instate better practices of prevention, management, and treatment. They would try out a new organization of the health care sector. Nurses, volunteers from the local governments, and doctors would work seamlessly in a health team that would be in charge of keeping the population healthy, with a heavy focus on encouraging lifestyle changes and preventative activities. They are currently planning to try the system in 152 health centers. 
Though the Additional Chief Secretary (top bureaucrat) in charge of health had invited us to the meeting, he was called away to deal with a doctors’ strike around the time the conversation turned to the specifics of the reform. He handed us over to a retired professor and a retired doctor, who have been charged with designing the specifics of the policy. This in itself is symptomatic: top policy makers usually have absolutely no time for figuring out the details of a policy plan, and delegate it to “experts.” In our conversation, we started to push on some specific questions on the model that they had in mind: why would patients pay attention to a nurse, given that until now they have only taken doctors seriously? Were they really sure that if the nurse started to take blood pressure and fill prescriptions, this would give her the authority she would need to dispense advice? Or that doctors would be willing and able to signal that nurses were to be respected, in a system that has always been heavily hierarchical? For that matter, did the planners really think it was going to be possible for health care professionals to spend a lot more time on public health and prevention when there were only two doctors for every 30,000 people?
What was striking was, not only did they not have any answer to these questions, but they showed no real interest in even entertaining them. Whenever we asked them to spell out what they thought their policy lever was (as opposed to their aspiration), the stock answer was that they did not really have one, that the local governments and medical officers could not be forced to do anything. May be a village committee would need to decide to organize yoga classes, but another one would not, so there was really no way to find out what really worked. This was entirely beside the point, since the presence (or absence) yoga class would have been an outcome of what they could do at the central government (train local governments in the fundamental of public health for example). But they seemed to have no understanding of a causal chain going from policy design to implementation and final outcome. Their position oscillated between presenting the illusion of the perfect system, and presenting the illusion of complete powerlessness in the face of local power and initiative.
We tried, and failed, to engage them on the details of policy. Not only did they have no understanding of plumbing issues, there was not even a realization that plumbing was an issue at all. When the top bureaucrat popped in and was appraised of the conversation it was decided that we would be shown some details. We went away to another meeting and came back after three hours. They had set up the projector, a sign that things were about to get serious. They displayed a power point with each of the new UN Sustainable Development Goals, and a list of proposals to achieve them in Kerala. These amounted to a long, meritorious, and likely totally vacuous, wish list (30 minutes of exercise per day mandatory in all schools, awareness of obesity to be built in communities, etc.). Interestingly enough, it had nothing to do with the health care reform that we had discussed that morning. It appeared that the details would have to wait for another day.

I have since learned to avoid these kinds of meetings in general, but this encounter reminded me of my early days as a plumber. It turns out that most policy makers, and most bureaucrats, are not very good plumbers.
Now this is what the actors in the story had to say by way of clarification,
Former additional chief secretary (health) Rajeev Sadanandan, who invited Banerjee and Duflo to Kerala and who in Duflo's paper is simply called 'top bureaucrat', said there could be other reasons why Duflo was upset. “It is not as if the officials here did not understand what she and others asked. It is just that local level planners felt that their technique was not suitable to measure the outcomes of Aardram,” Rajeev said. In short, there were ideological differences. “And Duflo and Banerjee were told that Kerala cannot adopt their methodology to assess the outcome of Aardram. They probably would have felt bad"... “She does not know how Kerala functions. Moreover, the people who interacted with Banerjee, Duflo and Gita Gopinath were part of the team that evolved the Aardram Mission,” he said. On their part, the Kerala team was not particularly enthused by Banerjee's and Duflo's randomised trial method. They felt it was a bit too rigid, and removed from reality. “The question was is it possible to break health into discrete activities whose effects can be segregated from related factors and studied separately,” Rajeev said...
Dr B Ekbal, a neurologist and Planning Board member... feels that Duflo had misunderstood the Aardram Mission. She seems to be under the impression that nurses would function as quasi doctors under the mission... Dr Ekbal said at no stage in the planning of Aardram Mission had the planners thought of asking nurses to take over even the least important functions of doctors, leave alone filling prescriptions. “At the most they will do a preliminary assessment of the patient, and even this is usual practice. Nonetheless, we have increased the number of nurses to four in a family health centre,” Dr Ekbal said... (On SDGs) “It is strange she called our goals vacuous when some of them included bringing down infant mortality rate to 8 from 12 and neonatal mortality from 7 to 5 by 2020,” Dr Ekbal said.
Needless to say, while the speech went viral (to further entrench a self-serving narrative), the clarification hardly got any mention.

Clearly the plumbers understood neither the government program nor the context. Instead, the speech constructed a straw man stereotype to caricature government officials, one which tarred everyone with the same brush, reinforced an entrenched narrative for a completely disconnected academic audience, and presented a compelling raison d'etre for the plumbers' own active engagement with public policy. It was a classic make-believe world!

Actually, the four paragraphs quoted above contain several other factual errors, leaving aside the deeply questionable subjective conclusions. The speech itself, especially on the other India examples, is littered with similar or graver problems.

Never mind that the actors and the context belong to arguably one of the more impressive global development successes, in general and especially in the health sector. Further, the derisively described officials are themselves among the most respected and deeply experienced professionals.

Anyway, fast forward to the times of Covid 19. Pretty much the same set of actors within the same Kerala government, using exactly the derided decentralised model and approach, are today globally acclaimed for their impressive work. This has several links to stories documenting the success to date of the Kerala model of Covid response. And this from the Washington Post,
Even though Kerala was the first state to report a coronavirus case in late January, the number of new cases in the first week of April dropped 30 percent from the previous week... The success in Kerala could prove instructive for the Indian government, which has largely shut down the country to stop the spread of the contagion... The state faced a potentially disastrous challenge: a disproportionately high number of foreign arrivals... Its challenges are plenty — from high population density to poor health care facilities — but experts say Kerala’s proactive measures like early detection and broad social support measures could serve as a model for the rest of the country... Kerala’s approach was effective because it was “both strict and humane,” said Shahid Jameel, a virologist and infectious disease expert... “Aggressive testing, isolating, tracing and treating — those are ways of containing an outbreak,” said Jameel, who is also the CEO of Wellcome Trust, a health research foundation. Henk Bekedam, the World Health Organization’s representative in India, attributed Kerala’s “prompt response” to its past “experience and investment” in emergency preparedness and pointed to measures such as district monitoring, risk communication and community engagement. 
Some of the things in the article itself are exaggerated - for example, Kerala did not do any mass testing, but only followed the standard Government of India protocols - symptomatic people with travel or contact history, health workers with symptoms, all hospitalised patients with severe acute respiratory illness (fever, cough etc), and asymptomatic direct and high-risk contacts of a confirmed patient (once between days 5 and 14 of having come in contact).

Incidentally, when the same set of officials comes up with this lockdown exit strategy (a detailed 36-page plumbing document), which is perhaps the only document of its kind available now anywhere in the world, and has become the reference document for others, what do the plumbers come up with? Sample the "nine steps" prescription,
First, try to make sure that every household has at least one person who knows the key symptoms of the disease. Second, spread the awareness that some people will get infected despite their best efforts; we want to avoid ostracism and concealment. Honest reporting is key. Third, offer multiple ways to report; a hotline, the ANM, the Asha, etc. Fourth, consider training the rural health practitioners (including the unqualified) in the detection of those symptoms and reporting them to the relevant authorities. Fifth, make sure that those reports are collated quickly so that we know where the new hotspots might be and more generally, focus on coordination of the evidence from across the country so that broad trends can be identified. Sixth, in each state create a large mobile team of health professionals, doctors and nurses, with testing kits and, ideally, ventilators and other equipment. The idea is a part of this team will be quickly deployed wherever the number of reports seems to be growing fast (including in nearby areas in other states)... Seventh, to build this team and make sure it has access to the necessary equipment, require all health professionals (and not just those who work for the government) to be available for call up where needed and give the teams the right to make use of all hospitals, private and public, as needed. Eighth, be much, much bolder with the social transfers schemes... Finally, be prepared to continue this “war effort” until the vaccine comes on line. Then vaccinate as many people as possible. And start to upgrade the healthcare system — let us be better prepared for the next time.
I'll not comment on this, but leave it to perceptive readers to make sense of it. Except to say this - the median official in any district could have come up with something much more practical, relevant, actionable at scale, and likely to be effective.

If only real plumbing was as easy as it is misunderstood to be!

It is indeed worth reflecting that, instead of scaremongering with models and peddling mass lockdowns and impossibilities like mass testing, none of the experts has had anything constructive to offer by way of practical and actionable suggestions to governments on calibrated exits, social distancing, management of migration, more optimal testing protocols, re-starting of business activities, managing epidemic relapses, and so on.

Interestingly, the role of experts in the US has been much more constructive and relevant. Sample this summary of the choices available and the debate surrounding them. When will the international development plumbers think about surfacing such debates for developing countries?

When faced with real world problems and solving them in real time, the "bureaucrat", "retired professor", and "retired doctor", along with several others, have done a remarkable job, one which is a model for others, and not just in India, to emulate. 

Ultimately, plumbing is about getting stuff done. And that requires people who have experience of doing stuff, not mere knowledge of concepts and frameworks. As mentioned in the beginning, teaching plumbing courses is not enough to make good plumbers!

Update 1 (18.04.2020)

Another example of plumbers just cutting and pasting things which are already being done, many of them since the beginning in many districts, and passing them off as novel suggestions. And what has not been done has a practical reason for not being done. If only fighting Covid 19 had such low-hanging fruit!